mirror of
https://github.com/yt-dlp/yt-dlp.git
synced 2025-06-28 01:18:30 +00:00
Merge branch 'master' of https://github.com/yt-dlp/yt-dlp into fix/ie/EuroParlWebstream
commit
7a29f46c5e
@ -758,3 +758,5 @@ somini
|
||||
thedenv
|
||||
vallovic
|
||||
arabcoders
|
||||
mireq
|
||||
mlabeeb03
|
||||
|
21
Changelog.md
@ -4,6 +4,27 @@ # Changelog
|
||||
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
|
||||
-->
|
||||
|
||||
### 2025.03.31
|
||||
|
||||
#### Core changes
|
||||
- [Add `--compat-options 2024`](https://github.com/yt-dlp/yt-dlp/commit/22e34adbd741e1c7072015debd615dc3fb71c401) ([#12789](https://github.com/yt-dlp/yt-dlp/issues/12789)) by [seproDev](https://github.com/seproDev)
|
||||
|
||||
#### Extractor changes
|
||||
- **francaisfacile**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/bb321cfdc3fd4400598ddb12a15862bc2ac8fc10) ([#12787](https://github.com/yt-dlp/yt-dlp/issues/12787)) by [mlabeeb03](https://github.com/mlabeeb03)
|
||||
- **generic**: [Validate response before checking m3u8 live status](https://github.com/yt-dlp/yt-dlp/commit/9a1ec1d36e172d252714cef712a6d091e0a0c4f2) ([#12784](https://github.com/yt-dlp/yt-dlp/issues/12784)) by [bashonly](https://github.com/bashonly)
|
||||
- **microsoftlearnepisode**: [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/d63696f23a341ee36a3237ccb5d5e14b34c2c579) ([#12799](https://github.com/yt-dlp/yt-dlp/issues/12799)) by [bashonly](https://github.com/bashonly)
|
||||
- **mlbtv**: [Fix radio-only extraction](https://github.com/yt-dlp/yt-dlp/commit/f033d86b96b36f8c5289dd7c3304f42d4d9f6ff4) ([#12792](https://github.com/yt-dlp/yt-dlp/issues/12792)) by [bashonly](https://github.com/bashonly)
|
||||
- **on24**: [Support `mainEvent` URLs](https://github.com/yt-dlp/yt-dlp/commit/e465b078ead75472fcb7b86f6ccaf2b5d3bc4c21) ([#12800](https://github.com/yt-dlp/yt-dlp/issues/12800)) by [bashonly](https://github.com/bashonly)
|
||||
- **sbs**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/29560359120f28adaaac67c86fa8442eb72daa0d) ([#12785](https://github.com/yt-dlp/yt-dlp/issues/12785)) by [bashonly](https://github.com/bashonly)
|
||||
- **stvr**: [Rename extractor from RTVS to STVR](https://github.com/yt-dlp/yt-dlp/commit/5fc521cbd0ce7b2410d0935369558838728e205d) ([#12788](https://github.com/yt-dlp/yt-dlp/issues/12788)) by [mireq](https://github.com/mireq)
|
||||
- **twitch**: clips: [Extract portrait formats](https://github.com/yt-dlp/yt-dlp/commit/61046c31612b30c749cbdae934b7fe26abe659d7) ([#12763](https://github.com/yt-dlp/yt-dlp/issues/12763)) by [DmitryScaletta](https://github.com/DmitryScaletta)
|
||||
- **youtube**
|
||||
- [Add `player_js_variant` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/07f04005e40ebdb368920c511e36e98af0077ed3) ([#12767](https://github.com/yt-dlp/yt-dlp/issues/12767)) by [bashonly](https://github.com/bashonly)
|
||||
- tab: [Fix playlist continuation extraction](https://github.com/yt-dlp/yt-dlp/commit/6a6d97b2cbc78f818de05cc96edcdcfd52caa259) ([#12777](https://github.com/yt-dlp/yt-dlp/issues/12777)) by [coletdjnz](https://github.com/coletdjnz)
|
||||
|
||||
#### Misc. changes
|
||||
- **cleanup**: Miscellaneous: [5e457af](https://github.com/yt-dlp/yt-dlp/commit/5e457af57fae9645b1b8fa0ed689229c8fb9656b) by [bashonly](https://github.com/bashonly)
|
||||
|
||||
### 2025.03.27
|
||||
|
||||
#### Core changes
|
||||
|
@ -1782,6 +1782,7 @@ #### youtube
|
||||
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
|
||||
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
|
||||
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma-separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player+XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request)
|
||||
* `player_js_variant`: The player JavaScript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. Only `main` is recommended as a possible workaround; the others are for debugging purposes. The default is to use what is prescribed by the site, and can be selected with `actual`
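
The extractor-arg formats described above can be sketched in a few lines of Python. This is only an illustration: `format_po_tokens` and `youtube_extractor_args` are hypothetical helper names, not part of yt-dlp, and the sketch assumes the usual `IE_KEY:ARG=VALUE;ARG=VALUE` shape accepted by `--extractor-args`.

```python
# Hypothetical helpers (not part of yt-dlp) that build extractor-arg strings.
VALID_CONTEXTS = {"gvs", "player"}  # the two contexts named in the README text


def format_po_tokens(tokens):
    """Build a `po_token` value from (client, context, token) tuples.

    Each entry becomes CLIENT.CONTEXT+PO_TOKEN; entries are comma-separated.
    """
    parts = []
    for client, context, token in tokens:
        if context not in VALID_CONTEXTS:
            raise ValueError(f"unknown PO Token context: {context!r}")
        parts.append(f"{client}.{context}+{token}")
    return ",".join(parts)


def youtube_extractor_args(**args):
    """Join keyword arguments into one `youtube:key=value;key=value` string."""
    return "youtube:" + ";".join(f"{key}={value}" for key, value in args.items())


if __name__ == "__main__":
    po_token = format_po_tokens([("web", "gvs", "XXX"), ("web_safari", "gvs", "YYY")])
    print(youtube_extractor_args(po_token=po_token, player_js_variant="main"))
```

The resulting string is what one would pass after `--extractor-args` on the command line; the placeholder tokens `XXX` and `YYY` are taken from the README example and are not real values.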
|
||||
|
||||
#### youtubetab (YouTube playlists, channels, feeds, etc.)
|
||||
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
|
||||
@ -2218,7 +2219,7 @@ ### Differences in default behavior
|
||||
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
|
||||
* YouTube channel URLs download all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL. If the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
|
||||
* Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
|
||||
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date.
|
||||
* The upload dates extracted from YouTube are in UTC.
|
||||
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
|
||||
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
|
||||
* Some internal metadata such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
|
||||
@ -2237,9 +2238,10 @@ ### Differences in default behavior
|
||||
* `--compat-options all`: Use all compat options (**Do NOT use this!**)
|
||||
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
|
||||
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
|
||||
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date`
|
||||
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization`
|
||||
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
|
||||
* `--compat-options 2023`: Same as `--compat-options prefer-vp9-sort`. Use this to enable all future compat options
|
||||
* `--compat-options 2023`: Same as `--compat-options 2024,prefer-vp9-sort`
|
||||
* `--compat-options 2024`: Currently does nothing. Use this to enable all future compat options
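
Because the year-based aliases chain into each other, it can help to see them expanded. The following is a minimal sketch, not yt-dlp's own implementation: `COMPAT_ALIASES` and `expand_compat` are hypothetical names, and the table assumes the post-change definitions in this hunk (the `2021` line without `no-youtube-prefer-utc-upload-date`, and `2023` chaining to `2024`).

```python
# Hypothetical alias table transcribed from the README lines in this hunk.
COMPAT_ALIASES = {
    "2021": ["2022", "no-certifi", "filename-sanitization"],
    "2022": ["2023", "playlist-match-filter", "no-external-downloader-progress",
             "prefer-legacy-http-handler", "manifest-filesize-approx"],
    "2023": ["2024", "prefer-vp9-sort"],
    "2024": [],  # "Currently does nothing"
}


def expand_compat(option):
    """Recursively expand a year alias into the concrete compat options."""
    expanded = []
    # A name that is not an alias expands to itself.
    for name in COMPAT_ALIASES.get(option, [option]):
        if name in COMPAT_ALIASES:
            expanded.extend(expand_compat(name))
        else:
            expanded.append(name)
    return expanded


if __name__ == "__main__":
    for year in ("2021", "2022", "2023", "2024"):
        print(year, "->", ",".join(expand_compat(year)) or "(nothing)")
```

Under these assumptions, `2024` expands to an empty list and `2023` to `prefer-vp9-sort` alone, which matches the wording of the two new README lines.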
|
||||
|
||||
The following compat options restore vulnerable behavior from before security patches:
|
||||
|
||||
|
@ -472,6 +472,7 @@ # Supported sites
|
||||
- **FoxNewsVideo**
|
||||
- **FoxSports**
|
||||
- **fptplay**: fptplay.vn
|
||||
- **FrancaisFacile**
|
||||
- **FranceCulture**
|
||||
- **FranceInter**
|
||||
- **francetv**
|
||||
@ -1251,7 +1252,6 @@ # Supported sites
|
||||
- **rtve.es:infantil**: RTVE infantil
|
||||
- **rtve.es:live**: RTVE.es live streams
|
||||
- **rtve.es:television**
|
||||
- **RTVS**
|
||||
- **rtvslo.si**
|
||||
- **rtvslo.si:show**
|
||||
- **RudoVideo**
|
||||
@ -1407,6 +1407,7 @@ # Supported sites
|
||||
- **StretchInternet**
|
||||
- **Stripchat**
|
||||
- **stv:player**
|
||||
- **stvr**: Slovak Television and Radio (formerly RTVS)
|
||||
- **Subsplash**
|
||||
- **subsplash:playlist**
|
||||
- **Substack**
|
||||
|
@ -659,6 +659,8 @@ def test_url_or_none(self):
|
||||
self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de')
|
||||
self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de')
|
||||
self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de')
|
||||
self.assertEqual(url_or_none('ws://foo.de'), 'ws://foo.de')
|
||||
self.assertEqual(url_or_none('wss://foo.de'), 'wss://foo.de')
|
||||
|
||||
def test_parse_age_limit(self):
|
||||
self.assertEqual(parse_age_limit(None), None)
|
||||
|
@ -85,6 +85,7 @@ def communicate_ws(reconnect):
|
||||
'quality': live_quality,
|
||||
'protocol': 'hls+fmp4',
|
||||
'latency': live_latency,
|
||||
'accessRightMethod': 'single_cookie',
|
||||
'chasePlay': False,
|
||||
},
|
||||
'room': {
|
||||
|
@ -683,6 +683,7 @@
|
||||
)
|
||||
from .foxsports import FoxSportsIE
|
||||
from .fptplay import FptplayIE
|
||||
from .francaisfacile import FrancaisFacileIE
|
||||
from .franceinter import FranceInterIE
|
||||
from .francetv import (
|
||||
FranceTVIE,
|
||||
@ -902,6 +903,7 @@
|
||||
IviIE,
|
||||
)
|
||||
from .ivideon import IvideonIE
|
||||
from .ivoox import IvooxIE
|
||||
from .iwara import (
|
||||
IwaraIE,
|
||||
IwaraPlaylistIE,
|
||||
@ -959,7 +961,10 @@
|
||||
)
|
||||
from .kicker import KickerIE
|
||||
from .kickstarter import KickStarterIE
|
||||
from .kika import KikaIE
|
||||
from .kika import (
|
||||
KikaIE,
|
||||
KikaPlaylistIE,
|
||||
)
|
||||
from .kinja import KinjaEmbedIE
|
||||
from .kinopoisk import KinoPoiskIE
|
||||
from .kommunetv import KommunetvIE
|
||||
@ -1060,6 +1065,7 @@
|
||||
from .lovehomeporn import LoveHomePornIE
|
||||
from .lrt import (
|
||||
LRTVODIE,
|
||||
LRTRadioIE,
|
||||
LRTStreamIE,
|
||||
)
|
||||
from .lsm import (
|
||||
@ -1492,6 +1498,10 @@
|
||||
)
|
||||
from .parler import ParlerIE
|
||||
from .parlview import ParlviewIE
|
||||
from .parti import (
|
||||
PartiLivestreamIE,
|
||||
PartiVideoIE,
|
||||
)
|
||||
from .patreon import (
|
||||
PatreonCampaignIE,
|
||||
PatreonIE,
|
||||
@ -1738,6 +1748,7 @@
|
||||
RoosterTeethSeriesIE,
|
||||
)
|
||||
from .rottentomatoes import RottenTomatoesIE
|
||||
from .roya import RoyaLiveIE
|
||||
from .rozhlas import (
|
||||
MujRozhlasIE,
|
||||
RozhlasIE,
|
||||
|
@ -146,7 +146,7 @@ class TokFMPodcastIE(InfoExtractor):
|
||||
'url': 'https://audycje.tokfm.pl/podcast/91275,-Systemowy-rasizm-Czy-zamieszki-w-USA-po-morderstwie-w-Minneapolis-doprowadza-do-zmian-w-sluzbach-panstwowych',
|
||||
'info_dict': {
|
||||
'id': '91275',
|
||||
'ext': 'aac',
|
||||
'ext': 'mp3',
|
||||
'title': 'md5:a9b15488009065556900169fb8061cce',
|
||||
'episode': 'md5:a9b15488009065556900169fb8061cce',
|
||||
'series': 'Analizy',
|
||||
@ -164,23 +164,20 @@ def _real_extract(self, url):
|
||||
raise ExtractorError('No such podcast', expected=True)
|
||||
metadata = metadata[0]
|
||||
|
||||
formats = []
|
||||
for ext in ('aac', 'mp3'):
|
||||
url_data = self._download_json(
|
||||
f'https://api.podcast.radioagora.pl/api4/getSongUrl?podcast_id={media_id}&device_id={uuid.uuid4()}&ppre=false&audio={ext}',
|
||||
media_id, f'Downloading podcast {ext} URL')
|
||||
# prevents inserting the mp3 (default) multiple times
|
||||
if 'link_ssl' in url_data and f'.{ext}' in url_data['link_ssl']:
|
||||
formats.append({
|
||||
'url': url_data['link_ssl'],
|
||||
'ext': ext,
|
||||
'vcodec': 'none',
|
||||
'acodec': ext,
|
||||
})
|
||||
mp3_url = self._download_json(
|
||||
'https://api.podcast.radioagora.pl/api4/getSongUrl',
|
||||
media_id, 'Downloading podcast mp3 URL', query={
|
||||
'podcast_id': media_id,
|
||||
'device_id': str(uuid.uuid4()),
|
||||
'ppre': 'false',
|
||||
'audio': 'mp3',
|
||||
})['link_ssl']
|
||||
|
||||
return {
|
||||
'id': media_id,
|
||||
'formats': formats,
|
||||
'url': mp3_url,
|
||||
'vcodec': 'none',
|
||||
'ext': 'mp3',
|
||||
'title': metadata.get('podcast_name'),
|
||||
'series': metadata.get('series_name'),
|
||||
'episode': metadata.get('podcast_name'),
|
||||
|
@ -1570,6 +1570,8 @@ def _yield_json_ld(self, html, video_id, *, fatal=True, default=NO_DEFAULT):
|
||||
"""Yield all json ld objects in the html"""
|
||||
if default is not NO_DEFAULT:
|
||||
fatal = False
|
||||
if not fatal and not isinstance(html, str):
|
||||
return
|
||||
for mobj in re.finditer(JSON_LD_RE, html):
|
||||
json_ld_item = self._parse_json(
|
||||
mobj.group('json_ld'), video_id, fatal=fatal,
|
||||
|
@ -5,7 +5,9 @@
|
||||
int_or_none,
|
||||
try_get,
|
||||
unified_strdate,
|
||||
url_or_none,
|
||||
)
|
||||
from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class CrowdBunkerIE(InfoExtractor):
|
||||
@ -44,16 +46,15 @@ def _real_extract(self, url):
|
||||
'url': sub_url,
|
||||
})
|
||||
|
||||
mpd_url = try_get(video_json, lambda x: x['dashManifest']['url'])
|
||||
if mpd_url:
|
||||
fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id)
|
||||
if mpd_url := traverse_obj(video_json, ('dashManifest', 'url', {url_or_none})):
|
||||
fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id, mpd_id='dash', fatal=False)
|
||||
formats.extend(fmts)
|
||||
subtitles = self._merge_subtitles(subtitles, subs)
|
||||
m3u8_url = try_get(video_json, lambda x: x['hlsManifest']['url'])
|
||||
if m3u8_url:
|
||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(mpd_url, video_id)
|
||||
self._merge_subtitles(subs, target=subtitles)
|
||||
|
||||
if m3u8_url := traverse_obj(video_json, ('hlsManifest', 'url', {url_or_none})):
|
||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, m3u8_id='hls', fatal=False)
|
||||
formats.extend(fmts)
|
||||
subtitles = self._merge_subtitles(subtitles, subs)
|
||||
self._merge_subtitles(subs, target=subtitles)
|
||||
|
||||
thumbnails = [{
|
||||
'url': image['url'],
|
||||
|
87
yt_dlp/extractor/francaisfacile.py
Normal file
@ -0,0 +1,87 @@
|
||||
import urllib.parse
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..networking.exceptions import HTTPError
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
url_or_none,
|
||||
)
|
||||
from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class FrancaisFacileIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://francaisfacile\.rfi\.fr/[a-z]{2}/(?:actualit%C3%A9|podcasts/[^/#?]+)/(?P<id>[^/#?]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250305-r%C3%A9concilier-les-jeunes-avec-la-lecture-gr%C3%A2ce-aux-r%C3%A9seaux-sociaux',
|
||||
'md5': '4f33674cb205744345cc835991100afa',
|
||||
'info_dict': {
|
||||
'id': 'WBMZ58952-FLE-FR-20250305',
|
||||
'display_id': '20250305-réconcilier-les-jeunes-avec-la-lecture-grâce-aux-réseaux-sociaux',
|
||||
'title': 'Réconcilier les jeunes avec la lecture grâce aux réseaux sociaux',
|
||||
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/05/6b6af52a-f9ba-11ef-a1f8-005056a97652.mp3',
|
||||
'ext': 'mp3',
|
||||
'description': 'md5:b903c63d8585bd59e8cc4d5f80c4272d',
|
||||
'duration': 103.15,
|
||||
'timestamp': 1741177984,
|
||||
'upload_date': '20250305',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250307-argentine-le-sac-d-un-alpiniste-retrouv%C3%A9-40-ans-apr%C3%A8s-sa-mort',
|
||||
'md5': 'b8c3a63652d4ae8e8092dda5700c1cd9',
|
||||
'info_dict': {
|
||||
'id': 'WBMZ59102-FLE-FR-20250307',
|
||||
'display_id': '20250307-argentine-le-sac-d-un-alpiniste-retrouvé-40-ans-après-sa-mort',
|
||||
'title': 'Argentine: le sac d\'un alpiniste retrouvé 40 ans après sa mort',
|
||||
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/07/8edf4082-fb46-11ef-8a37-005056bf762b.mp3',
|
||||
'ext': 'mp3',
|
||||
'description': 'md5:7fd088fbdf4a943bb68cf82462160dca',
|
||||
'duration': 117.74,
|
||||
'timestamp': 1741352789,
|
||||
'upload_date': '20250307',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://francaisfacile.rfi.fr/fr/podcasts/un-mot-une-histoire/20250317-le-mot-de-david-foenkinos-peut-%C3%AAtre',
|
||||
'md5': 'db83c2cc2589b4c24571c6b6cf14f5f1',
|
||||
'info_dict': {
|
||||
'id': 'WBMZ59441-FLE-FR-20250317',
|
||||
'display_id': '20250317-le-mot-de-david-foenkinos-peut-être',
|
||||
'title': 'Le mot de David Foenkinos: «peut-être» - Un mot, une histoire',
|
||||
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/17/4ca6cbbe-0315-11f0-a85b-005056a97652.mp3',
|
||||
'ext': 'mp3',
|
||||
'description': 'md5:3fe35fae035803df696bfa7af2496e49',
|
||||
'duration': 198.96,
|
||||
'timestamp': 1742210897,
|
||||
'upload_date': '20250317',
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = urllib.parse.unquote(self._match_id(url))
|
||||
|
||||
try: # yt-dlp's default user-agents are too old and blocked by the site
|
||||
webpage = self._download_webpage(url, display_id, headers={
|
||||
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
|
||||
})
|
||||
except ExtractorError as e:
|
||||
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
|
||||
raise
|
||||
# Retry with impersonation if hardcoded UA is insufficient
|
||||
webpage = self._download_webpage(url, display_id, impersonate=True)
|
||||
|
||||
data = self._search_json(
|
||||
r'<script[^>]+\bdata-media-id=[^>]+\btype="application/json"[^>]*>',
|
||||
webpage, 'audio data', display_id)
|
||||
|
||||
return {
|
||||
'id': data['mediaId'],
|
||||
'display_id': display_id,
|
||||
'vcodec': 'none',
|
||||
'title': self._html_extract_title(webpage),
|
||||
**self._search_json_ld(webpage, display_id, fatal=False),
|
||||
**traverse_obj(data, {
|
||||
'title': ('title', {str}),
|
||||
'url': ('sources', ..., 'url', {url_or_none}, any),
|
||||
'duration': ('sources', ..., 'duration', {float_or_none}, any),
|
||||
}),
|
||||
}
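
To illustrate how the `display_id` values in the tests above follow from the URL pattern, here is a standalone sketch using only the standard library. `display_id_from_url` is a hypothetical helper; the extractor itself relies on `self._match_id`, which this sketch only approximates with `re.match`.

```python
import re
import urllib.parse

# Pattern copied from FrancaisFacileIE._VALID_URL in the diff above.
VALID_URL = (
    r'https?://francaisfacile\.rfi\.fr/[a-z]{2}/'
    r'(?:actualit%C3%A9|podcasts/[^/#?]+)/(?P<id>[^/#?]+)'
)


def display_id_from_url(url):
    """Return the percent-decoded `id` group, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    return urllib.parse.unquote(mobj.group('id'))


if __name__ == "__main__":
    print(display_id_from_url(
        'https://francaisfacile.rfi.fr/fr/podcasts/un-mot-une-histoire/'
        '20250317-le-mot-de-david-foenkinos-peut-%C3%AAtre'))
```

The percent-encoded slug is decoded so that the accented characters appear in `display_id` exactly as the test expectations list them.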
|
@ -2214,10 +2214,21 @@ def hex_or_none(value):
|
||||
if is_live is not None:
|
||||
info['live_status'] = 'not_live' if is_live == 'false' else 'is_live'
|
||||
return
|
||||
headers = m3u8_format.get('http_headers') or info.get('http_headers')
|
||||
duration = self._extract_m3u8_vod_duration(
|
||||
m3u8_format['url'], info.get('id'), note='Checking m3u8 live status',
|
||||
errnote='Failed to download m3u8 media playlist', headers=headers)
|
||||
headers = m3u8_format.get('http_headers') or info.get('http_headers') or {}
|
||||
display_id = info.get('id')
|
||||
urlh = self._request_webpage(
|
||||
m3u8_format['url'], display_id, 'Checking m3u8 live status', errnote=False,
|
||||
headers={**headers, 'Accept-Encoding': 'identity'}, fatal=False)
|
||||
if urlh is False:
|
||||
return
|
||||
first_bytes = urlh.read(512)
|
||||
if not first_bytes.startswith(b'#EXTM3U'):
|
||||
return
|
||||
m3u8_doc = self._webpage_read_content(
|
||||
urlh, urlh.url, display_id, prefix=first_bytes, fatal=False, errnote=False)
|
||||
if not m3u8_doc:
|
||||
return
|
||||
duration = self._parse_m3u8_vod_duration(m3u8_doc, display_id)
|
||||
if not duration:
|
||||
info['live_status'] = 'is_live'
|
||||
info['duration'] = info.get('duration') or duration
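
The idea behind the new check (look at the first bytes of the response before treating it as a playlist) can be sketched outside the extractor. This is an assumption-laden illustration rather than the code path above: `vod_duration` and `check_live_status` are hypothetical names, and the duration logic simply sums `#EXTINF` values when `#EXT-X-ENDLIST` is present.

```python
def vod_duration(m3u8_doc):
    """Sum #EXTINF durations of a finished (VOD) playlist; None if it has no end tag."""
    if '#EXT-X-ENDLIST' not in m3u8_doc:
        return None
    total = sum(
        float(line[len('#EXTINF:'):].split(',')[0])
        for line in m3u8_doc.splitlines()
        if line.startswith('#EXTINF:'))
    return int(total) or None


def check_live_status(body):
    """Return the info-dict updates implied by a raw response body (bytes)."""
    # Validate the response first: anything not starting with #EXTM3U is ignored.
    if not body[:512].startswith(b'#EXTM3U'):
        return {}
    duration = vod_duration(body.decode('utf-8', errors='replace'))
    if not duration:
        return {'live_status': 'is_live'}
    return {'duration': duration}


if __name__ == "__main__":
    vod = (b'#EXTM3U\n#EXT-X-TARGETDURATION:10\n'
           b'#EXTINF:9.5,\nseg0.ts\n#EXTINF:10.0,\nseg1.ts\n#EXT-X-ENDLIST\n')
    print(check_live_status(vod))
    print(check_live_status(b'<!DOCTYPE html><html></html>'))
```

An HTML error page therefore no longer gets mistaken for a playlist without an end tag, which is the case the "validate response" change guards against.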
|
||||
|
78
yt_dlp/extractor/ivoox.py
Normal file
@ -0,0 +1,78 @@
|
||||
from .common import InfoExtractor
|
||||
from ..utils import int_or_none, parse_iso8601, url_or_none, urljoin
|
||||
from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class IvooxIE(InfoExtractor):
|
||||
_VALID_URL = (
|
||||
r'https?://(?:www\.)?ivoox\.com/(?:\w{2}/)?[^/?#]+_rf_(?P<id>[0-9]+)_1\.html',
|
||||
r'https?://go\.ivoox\.com/rf/(?P<id>[0-9]+)',
|
||||
)
|
||||
_TESTS = [{
|
||||
'url': 'https://www.ivoox.com/dex-08x30-rostros-del-mal-los-asesinos-en-audios-mp3_rf_143594959_1.html',
|
||||
'md5': '993f712de5b7d552459fc66aa3726885',
|
||||
'info_dict': {
|
||||
'id': '143594959',
|
||||
'ext': 'mp3',
|
||||
'timestamp': 1742731200,
|
||||
'channel': 'DIAS EXTRAÑOS con Santiago Camacho',
|
||||
'title': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
|
||||
'description': 'md5:eae8b4b9740d0216d3871390b056bb08',
|
||||
'uploader': 'Santiago Camacho',
|
||||
'thumbnail': 'https://static-1.ivoox.com/audios/c/d/5/2/cd52f46783fe735000c33a803dce2554_XXL.jpg',
|
||||
'upload_date': '20250323',
|
||||
'episode': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
|
||||
'duration': 11837,
|
||||
'tags': ['españa', 'asesinos en serie', 'arropiero', 'historia criminal', 'mataviejas'],
|
||||
},
|
||||
}, {
|
||||
'url': 'https://go.ivoox.com/rf/143594959',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.ivoox.com/en/campodelgas-28-03-2025-audios-mp3_rf_144036942_1.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
media_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, media_id, fatal=False)
|
||||
|
||||
data = self._search_nuxt_data(
|
||||
webpage, media_id, fatal=False, traverse=('data', 0, 'data', 'audio'))
|
||||
|
||||
direct_download = self._download_json(
|
||||
f'https://vcore-web.ivoox.com/v1/public/audios/{media_id}/download-url', media_id, fatal=False,
|
||||
note='Fetching direct download link', headers={'Referer': url})
|
||||
|
||||
download_paths = {
|
||||
*traverse_obj(direct_download, ('data', 'downloadUrl', {str}, filter, all)),
|
||||
*traverse_obj(data, (('downloadUrl', 'mediaUrl'), {str}, filter)),
|
||||
}
|
||||
|
||||
formats = []
|
||||
for path in download_paths:
|
||||
formats.append({
|
||||
'url': urljoin('https://ivoox.com', path),
|
||||
'http_headers': {'Referer': url},
|
||||
})
|
||||
|
||||
return {
|
||||
'id': media_id,
|
||||
'formats': formats,
|
||||
'uploader': self._html_search_regex(r'data-prm-author="([^"]+)"', webpage, 'author', default=None),
|
||||
'timestamp': parse_iso8601(
|
||||
self._html_search_regex(r'data-prm-pubdate="([^"]+)"', webpage, 'timestamp', default=None)),
|
||||
'channel': self._html_search_regex(r'data-prm-podname="([^"]+)"', webpage, 'channel', default=None),
|
||||
'title': self._html_search_regex(r'data-prm-title="([^"]+)"', webpage, 'title', default=None),
|
||||
'thumbnail': self._og_search_thumbnail(webpage, default=None),
|
||||
'description': self._og_search_description(webpage, default=None),
|
||||
**self._search_json_ld(webpage, media_id, default={}),
|
||||
**traverse_obj(data, {
|
||||
'title': ('title', {str}),
|
||||
'description': ('description', {str}),
|
||||
'thumbnail': ('image', {url_or_none}),
|
||||
'timestamp': ('uploadDate', {parse_iso8601(delimiter=' ')}),
|
||||
'duration': ('duration', {int_or_none}),
|
||||
'tags': ('tags', ..., 'name', {str}),
|
||||
}),
|
||||
}
|
@ -1,3 +1,5 @@
|
||||
import itertools
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
@ -124,3 +126,43 @@ def _extract_formats(self, media_info, video_id):
|
||||
'vbr': ('bitrateVideo', {int_or_none}, {lambda x: None if x == -1 else x}),
|
||||
}),
|
||||
}
|
||||
|
||||
|
||||
class KikaPlaylistIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?kika\.de/[\w-]+/(?P<id>[a-z-]+\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'https://www.kika.de/logo/logo-die-welt-und-ich-562',
|
||||
'info_dict': {
|
||||
'id': 'logo-die-welt-und-ich-562',
|
||||
'title': 'logo!',
|
||||
'description': 'md5:7b9d7f65561b82fa512f2cfb553c397d',
|
||||
},
|
||||
'playlist_count': 100,
|
||||
}]
|
||||
|
||||
def _entries(self, playlist_url, playlist_id):
|
||||
for page in itertools.count(1):
|
||||
data = self._download_json(playlist_url, playlist_id, note=f'Downloading page {page}')
|
||||
for item in traverse_obj(data, ('content', lambda _, v: url_or_none(v['api']['url']))):
|
||||
yield self.url_result(
|
||||
item['api']['url'], ie=KikaIE,
|
||||
**traverse_obj(item, {
|
||||
'id': ('id', {str}),
|
||||
'title': ('title', {str}),
|
||||
'duration': ('duration', {int_or_none}),
|
||||
'timestamp': ('date', {parse_iso8601}),
|
||||
}))
|
||||
|
||||
playlist_url = traverse_obj(data, ('links', 'next', {url_or_none}))
|
||||
if not playlist_url:
|
||||
break
|
||||
|
||||
def _real_extract(self, url):
|
||||
playlist_id = self._match_id(url)
|
||||
brand_data = self._download_json(
|
||||
f'https://www.kika.de/_next-api/proxy/v1/brands/{playlist_id}', playlist_id)
|
||||
|
||||
return self.playlist_result(
|
||||
self._entries(brand_data['videoSubchannel']['videosPageUrl'], playlist_id),
|
||||
playlist_id, title=brand_data.get('title'), description=brand_data.get('description'))
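
The pagination loop in `_entries` follows a `links.next` URL until none is returned. A minimal standalone sketch of that pattern, with a stubbed fetch function in place of `_download_json`, might look like this; `iter_paginated` and `fetch_json` are hypothetical names and the page payloads are made up for the example.

```python
import itertools


def iter_paginated(first_url, fetch_json):
    """Yield items from every page, following `links.next` until it is missing."""
    url = first_url
    for page in itertools.count(1):
        data = fetch_json(url, page)
        yield from data.get('content') or []
        url = (data.get('links') or {}).get('next')
        if not url:
            break


if __name__ == "__main__":
    # Fake two-page API used only for illustration.
    pages = {
        'page-1': {'content': ['a', 'b'], 'links': {'next': 'page-2'}},
        'page-2': {'content': ['c'], 'links': {}},
    }
    print(list(iter_paginated('page-1', lambda url, page: pages[url])))
```

Using a generator keeps the playlist lazy: later pages are only requested once earlier entries have been consumed.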
|
||||
|
@ -2,8 +2,11 @@
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
merge_dicts,
|
||||
str_or_none,
|
||||
traverse_obj,
|
||||
unified_timestamp,
|
||||
url_or_none,
|
||||
urljoin,
|
||||
)
|
||||
|
||||
|
||||
@ -80,7 +83,7 @@ class LRTVODIE(LRTBaseIE):
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
path, video_id = self._match_valid_url(url).groups()
|
||||
path, video_id = self._match_valid_url(url).group('path', 'id')
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
media_url = self._extract_js_var(webpage, 'main_url', path)
|
||||
@ -106,3 +109,42 @@ def _real_extract(self, url):
|
||||
}
|
||||
|
||||
return merge_dicts(clean_info, jw_data, json_ld_data)
|
||||
|
||||
|
||||
class LRTRadioIE(LRTBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?lrt\.lt/radioteka/irasas/(?P<id>\d+)/(?P<path>[^?#/]+)'
|
||||
_TESTS = [{
|
||||
# m3u8 download
|
||||
'url': 'https://www.lrt.lt/radioteka/irasas/2000359728/nemarios-eiles-apie-pragarus-ir-skaistyklas-su-aiste-kiltinaviciute',
|
||||
'info_dict': {
|
||||
'id': '2000359728',
|
||||
'ext': 'm4a',
|
||||
'title': 'Nemarios eilės: apie pragarus ir skaistyklas su Aiste Kiltinavičiūte',
|
||||
'description': 'md5:5eee9a0e86a55bf547bd67596204625d',
|
||||
'timestamp': 1726143120,
|
||||
'upload_date': '20240912',
|
||||
'tags': 'count:5',
|
||||
'thumbnail': r're:https?://.+/.+\.jpe?g',
|
||||
'categories': ['Daiktiniai įrodymai'],
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.lrt.lt/radioteka/irasas/2000304654/vakaras-su-knyga-svetlana-aleksijevic-cernobylio-malda-v-dalis?season=%2Fmediateka%2Faudio%2Fvakaras-su-knyga%2F2023',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id, path = self._match_valid_url(url).group('id', 'path')
|
||||
media = self._download_json(
|
||||
'https://www.lrt.lt/radioteka/api/media', video_id,
|
||||
query={'url': f'/mediateka/irasas/{video_id}/{path}'})
|
||||
|
||||
return traverse_obj(media, {
|
||||
'id': ('id', {int}, {str_or_none}),
|
||||
'title': ('title', {str}),
|
||||
'tags': ('tags', ..., 'name', {str}),
|
||||
'categories': ('playlist_item', 'category', {str}, filter, all, filter),
|
||||
'description': ('content', {clean_html}, {str}),
|
||||
'timestamp': ('date', {lambda x: x.replace('.', '/')}, {unified_timestamp}),
|
||||
'thumbnail': ('playlist_item', 'image', {urljoin('https://www.lrt.lt')}),
|
||||
'formats': ('playlist_item', 'file', {lambda x: self._extract_m3u8_formats(x, video_id)}),
|
||||
})
|
||||
|
@ -4,6 +4,7 @@
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
parse_resolution,
|
||||
traverse_obj,
|
||||
unified_timestamp,
|
||||
url_basename,
|
||||
@ -83,8 +84,8 @@ def _sub_to_dict(subtitle_list):
|
||||
subtitles.setdefault(sub.pop('tag', 'und'), []).append(sub)
|
||||
return subtitles
|
||||
|
||||
def _extract_ism(self, ism_url, video_id):
|
||||
formats = self._extract_ism_formats(ism_url, video_id)
|
||||
def _extract_ism(self, ism_url, video_id, fatal=True):
|
||||
formats = self._extract_ism_formats(ism_url, video_id, fatal=fatal)
|
||||
for fmt in formats:
|
||||
if fmt['language'] != 'eng' and 'English' not in fmt['format_id']:
|
||||
fmt['language_preference'] = -10
|
||||
@ -218,9 +219,21 @@ class MicrosoftLearnEpisodeIE(MicrosoftMediusBaseIE):
|
||||
'description': 'md5:7bbbfb593d21c2cf2babc3715ade6b88',
|
||||
'timestamp': 1676339547,
|
||||
'upload_date': '20230214',
|
||||
'thumbnail': r're:https://learn\.microsoft\.com/video/media/.*\.png',
|
||||
'thumbnail': r're:https://learn\.microsoft\.com/video/media/.+\.png',
|
||||
'subtitles': 'count:14',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://learn.microsoft.com/en-gb/shows/on-demand-instructor-led-training-series/az-900-module-1',
|
||||
'info_dict': {
|
||||
'id': '4fe10f7c-d83c-463b-ac0e-c30a8195e01b',
|
||||
'ext': 'mp4',
|
||||
'title': 'AZ-900 Cloud fundamentals (1 of 6)',
|
||||
'description': 'md5:3c2212ce865e9142f402c766441bd5c9',
|
||||
'thumbnail': r're:https://.+/.+\.jpg',
|
||||
'timestamp': 1706605184,
|
||||
'upload_date': '20240130',
|
||||
},
|
||||
'params': {'format': 'bv[protocol=https]'},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -230,9 +243,32 @@ def _real_extract(self, url):
|
||||
entry_id = self._html_search_meta('entryId', webpage, 'entryId', fatal=True)
|
||||
video_info = self._download_json(
|
||||
f'https://learn.microsoft.com/api/video/public/v1/entries/{entry_id}', video_id)
|
||||
|
||||
formats = []
|
||||
if ism_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoUrl', {url_or_none})):
|
||||
formats.extend(self._extract_ism(ism_url, video_id, fatal=False))
|
||||
if hls_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoHLSUrl', {url_or_none})):
|
||||
formats.extend(self._extract_m3u8_formats(hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
|
||||
if mpd_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoDashUrl', {url_or_none})):
|
||||
formats.extend(self._extract_mpd_formats(mpd_url, video_id, mpd_id='dash', fatal=False))
|
||||
for key in ('low', 'medium', 'high'):
|
||||
if video_url := traverse_obj(video_info, ('publicVideo', f'{key}QualityVideoUrl', {url_or_none})):
|
||||
formats.append({
|
||||
'url': video_url,
|
||||
'format_id': f'video-http-{key}',
|
||||
'acodec': 'none',
|
||||
**parse_resolution(video_url),
|
||||
})
|
||||
if audio_url := traverse_obj(video_info, ('publicVideo', 'audioUrl', {url_or_none})):
|
||||
formats.append({
|
||||
'url': audio_url,
|
||||
'format_id': 'audio-http',
|
||||
'vcodec': 'none',
|
||||
})
|
||||
|
||||
return {
|
||||
'id': entry_id,
|
||||
'formats': self._extract_ism(video_info['publicVideo']['adaptiveVideoUrl'], video_id),
|
||||
'formats': formats,
|
||||
'subtitles': self._sub_to_dict(traverse_obj(video_info, (
|
||||
'publicVideo', 'captions', lambda _, v: url_or_none(v['url']), {
|
||||
'tag': ('language', {str}),
|
||||
|
@ -10,7 +10,9 @@
|
||||
parse_iso8601,
|
||||
strip_or_none,
|
||||
try_get,
|
||||
url_or_none,
|
||||
)
|
||||
from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class MixcloudBaseIE(InfoExtractor):
|
||||
@ -37,7 +39,7 @@ class MixcloudIE(MixcloudBaseIE):
|
||||
'ext': 'm4a',
|
||||
'title': 'Cryptkeeper',
|
||||
'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.',
|
||||
'uploader': 'Daniel Holbach',
|
||||
'uploader': 'dholbach',
|
||||
'uploader_id': 'dholbach',
|
||||
'thumbnail': r're:https?://.*\.jpg',
|
||||
'view_count': int,
|
||||
@ -46,10 +48,11 @@ class MixcloudIE(MixcloudBaseIE):
|
||||
'uploader_url': 'https://www.mixcloud.com/dholbach/',
|
||||
'artist': 'Submorphics & Chino , Telekinesis, Porter Robinson, Enei, Breakage ft Jess Mills',
|
||||
'duration': 3723,
|
||||
'tags': [],
|
||||
'tags': ['liquid drum and bass', 'drum and bass'],
|
||||
'comment_count': int,
|
||||
'repost_count': int,
|
||||
'like_count': int,
|
||||
'artists': list,
|
||||
},
|
||||
'params': {'skip_download': 'm3u8'},
|
||||
}, {
|
||||
@ -67,7 +70,7 @@ class MixcloudIE(MixcloudBaseIE):
|
||||
'upload_date': '20150203',
|
||||
'uploader_url': 'https://www.mixcloud.com/gillespeterson/',
|
||||
'duration': 2992,
|
||||
'tags': [],
|
||||
'tags': ['jazz', 'soul', 'world music', 'funk'],
|
||||
'comment_count': int,
|
||||
'repost_count': int,
|
||||
'like_count': int,
|
||||
@ -149,8 +152,6 @@ def _real_extract(self, url):
|
||||
elif reason:
|
||||
raise ExtractorError('Track is restricted', expected=True)
|
||||
|
||||
title = cloudcast['name']
|
||||
|
||||
stream_info = cloudcast['streamInfo']
|
||||
formats = []
|
||||
|
||||
@ -182,47 +183,39 @@ def _real_extract(self, url):
|
||||
self.raise_login_required(metadata_available=True)
|
||||
|
||||
comments = []
|
||||
for edge in (try_get(cloudcast, lambda x: x['comments']['edges']) or []):
|
||||
node = edge.get('node') or {}
|
||||
for node in traverse_obj(cloudcast, ('comments', 'edges', ..., 'node', {dict})):
|
||||
text = strip_or_none(node.get('comment'))
|
||||
if not text:
|
||||
continue
|
||||
user = node.get('user') or {}
|
||||
comments.append({
|
||||
'author': user.get('displayName'),
|
||||
'author_id': user.get('username'),
|
||||
'text': text,
|
||||
'timestamp': parse_iso8601(node.get('created')),
|
||||
**traverse_obj(node, {
|
||||
'author': ('user', 'displayName', {str}),
|
||||
'author_id': ('user', 'username', {str}),
|
||||
'timestamp': ('created', {parse_iso8601}),
|
||||
}),
|
||||
})
|
||||
|
||||
tags = []
|
||||
for t in cloudcast.get('tags'):
|
||||
tag = try_get(t, lambda x: x['tag']['name'], str)
|
||||
if not tag:
|
||||
tags.append(tag)
|
||||
|
||||
get_count = lambda x: int_or_none(try_get(cloudcast, lambda y: y[x]['totalCount']))
|
||||
|
||||
owner = cloudcast.get('owner') or {}
|
||||
|
||||
return {
|
||||
'id': track_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'description': cloudcast.get('description'),
|
||||
'thumbnail': try_get(cloudcast, lambda x: x['picture']['url'], str),
|
||||
'uploader': owner.get('displayName'),
|
||||
'timestamp': parse_iso8601(cloudcast.get('publishDate')),
|
||||
'uploader_id': owner.get('username'),
|
||||
'uploader_url': owner.get('url'),
|
||||
'duration': int_or_none(cloudcast.get('audioLength')),
|
||||
'view_count': int_or_none(cloudcast.get('plays')),
|
||||
'like_count': get_count('favorites'),
|
||||
'repost_count': get_count('reposts'),
|
||||
'comment_count': get_count('comments'),
|
||||
'comments': comments,
|
||||
'tags': tags,
|
||||
'artist': ', '.join(cloudcast.get('featuringArtistList') or []) or None,
|
||||
**traverse_obj(cloudcast, {
|
||||
'title': ('name', {str}),
|
||||
'description': ('description', {str}),
|
||||
'thumbnail': ('picture', 'url', {url_or_none}),
|
||||
'timestamp': ('publishDate', {parse_iso8601}),
|
||||
'duration': ('audioLength', {int_or_none}),
|
||||
'uploader': ('owner', 'displayName', {str}),
|
||||
'uploader_id': ('owner', 'username', {str}),
|
||||
'uploader_url': ('owner', 'url', {url_or_none}),
|
||||
'view_count': ('plays', {int_or_none}),
|
||||
'like_count': ('favorites', 'totalCount', {int_or_none}),
|
||||
'repost_count': ('reposts', 'totalCount', {int_or_none}),
|
||||
'comment_count': ('comments', 'totalCount', {int_or_none}),
|
||||
'tags': ('tags', ..., 'tag', 'name', {str}, filter, all, filter),
|
||||
'artists': ('featuringArtistList', ..., {str}, filter, all, filter),
|
||||
}),
|
||||
}
|
||||
|
||||
|
||||
@ -295,7 +288,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
|
||||
'url': 'http://www.mixcloud.com/dholbach/',
|
||||
'info_dict': {
|
||||
'id': 'dholbach_uploads',
|
||||
'title': 'Daniel Holbach (uploads)',
|
||||
'title': 'dholbach (uploads)',
|
||||
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
|
||||
},
|
||||
'playlist_mincount': 36,
|
||||
@ -303,7 +296,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
|
||||
'url': 'http://www.mixcloud.com/dholbach/uploads/',
|
||||
'info_dict': {
|
||||
'id': 'dholbach_uploads',
|
||||
'title': 'Daniel Holbach (uploads)',
|
||||
'title': 'dholbach (uploads)',
|
||||
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
|
||||
},
|
||||
'playlist_mincount': 36,
|
||||
@ -311,7 +304,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
|
||||
'url': 'http://www.mixcloud.com/dholbach/favorites/',
|
||||
'info_dict': {
|
||||
'id': 'dholbach_favorites',
|
||||
'title': 'Daniel Holbach (favorites)',
|
||||
'title': 'dholbach (favorites)',
|
||||
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
|
||||
},
|
||||
# 'params': {
|
||||
@ -337,7 +330,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
|
||||
'title': 'First Ear (stream)',
|
||||
'description': 'we maraud for ears',
|
||||
},
|
||||
'playlist_mincount': 269,
|
||||
'playlist_mincount': 267,
|
||||
}]
|
||||
|
||||
_TITLE_KEY = 'displayName'
|
||||
@ -361,7 +354,7 @@ class MixcloudPlaylistIE(MixcloudPlaylistBaseIE):
|
||||
'id': 'maxvibes_jazzcat-on-ness-radio',
|
||||
'title': 'Ness Radio sessions',
|
||||
},
|
||||
'playlist_mincount': 59,
|
||||
'playlist_mincount': 58,
|
||||
}]
|
||||
_TITLE_KEY = 'name'
|
||||
_DESCRIPTION_KEY = 'description'
|
||||
|
@ -449,9 +449,7 @@ def _extract_formats_and_subtitles(self, broadcast, video_id):
|
||||
|
||||
if not (m3u8_url and token):
|
||||
errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str})))
|
||||
if 'not entitled' in errors:
|
||||
raise ExtractorError(errors, expected=True)
|
||||
elif errors: # Only warn when 'blacked out' since radio formats are available
|
||||
if errors: # Only warn when 'blacked out' or 'not entitled'; radio formats may be available
|
||||
self.report_warning(f'API returned errors for {format_id}: {errors}')
|
||||
else:
|
||||
self.report_warning(f'No formats available for {format_id} broadcast; skipping')
|
||||
|
@ -27,6 +27,7 @@
|
||||
traverse_obj,
|
||||
try_get,
|
||||
unescapeHTML,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
url_basename,
|
||||
url_or_none,
|
||||
@ -985,6 +986,7 @@ def _real_extract(self, url):
|
||||
'quality': 'abr',
|
||||
'protocol': 'hls+fmp4',
|
||||
'latency': latency,
|
||||
'accessRightMethod': 'single_cookie',
|
||||
'chasePlay': False,
|
||||
},
|
||||
'room': {
|
||||
@ -1005,6 +1007,7 @@ def _real_extract(self, url):
|
||||
if data.get('type') == 'stream':
|
||||
m3u8_url = data['data']['uri']
|
||||
qualities = data['data']['availableQualities']
|
||||
cookies = data['data']['cookies']
|
||||
break
|
||||
elif data.get('type') == 'disconnect':
|
||||
self.write_debug(recv)
|
||||
@ -1043,6 +1046,11 @@ def _real_extract(self, url):
|
||||
**res,
|
||||
})
|
||||
|
||||
for cookie in cookies:
|
||||
self._set_cookie(
|
||||
cookie['domain'], cookie['name'], cookie['value'],
|
||||
expire_time=unified_timestamp(cookie['expires']), path=cookie['path'], secure=cookie['secure'])
|
||||
|
||||
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True)
|
||||
for fmt, q in zip(formats, reversed(qualities[1:])):
|
||||
fmt.update({
|
||||
|
@ -11,12 +11,15 @@ class On24IE(InfoExtractor):
|
||||
IE_NAME = 'on24'
|
||||
IE_DESC = 'ON24'
|
||||
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://event\.on24\.com/(?:
|
||||
wcc/r/(?P<id_1>\d{7})/(?P<key_1>[0-9A-F]{32})|
|
||||
eventRegistration/(?:console/EventConsoleApollo|EventLobbyServlet\?target=lobby30)
|
||||
\.jsp\?(?:[^/#?]*&)?eventid=(?P<id_2>\d{7})[^/#?]*&key=(?P<key_2>[0-9A-F]{32})
|
||||
)'''
|
||||
_ID_RE = r'(?P<id>\d{7})'
|
||||
_KEY_RE = r'(?P<key>[0-9A-F]{32})'
|
||||
_URL_BASE_RE = r'https?://event\.on24\.com'
|
||||
_URL_QUERY_RE = rf'(?:[^#]*&)?eventid={_ID_RE}&(?:[^#]+&)?key={_KEY_RE}'
|
||||
_VALID_URL = [
|
||||
rf'{_URL_BASE_RE}/wcc/r/{_ID_RE}/{_KEY_RE}',
|
||||
rf'{_URL_BASE_RE}/eventRegistration/console/(?:EventConsoleApollo\.jsp|apollox/mainEvent/?)\?{_URL_QUERY_RE}',
|
||||
rf'{_URL_BASE_RE}/eventRegistration/EventLobbyServlet/?\?{_URL_QUERY_RE}',
|
||||
]
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84&contenttype=A&eventuserid=305999&playerwidth=1000&playerheight=650&caller=previewLobby&text_language_id=en&format=fhaudio&newConsole=false',
|
||||
@ -34,12 +37,16 @@ class On24IE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?&eventid=2639291&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=82829018E813065A122363877975752E&newConsole=true&nxChe=true&newTabCon=true&text_language_id=en&playerwidth=748&playerheight=526&eventuserid=338788762&contenttype=A&mediametricsessionid=384764716&mediametricid=3558192&usercd=369267058&mode=launch',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://event.on24.com/eventRegistration/EventLobbyServlet?target=reg20.jsp&eventid=3543176&key=BC0F6B968B67C34B50D461D40FDB3E18&groupId=3143628',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://event.on24.com/eventRegistration/console/apollox/mainEvent?&eventid=4843671&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=4EAC9B5C564CC98FF29E619B06A2F743&newConsole=true&nxChe=true&newTabCon=true&consoleEarEventConsole=false&consoleEarCloudApi=false&text_language_id=en&playerwidth=748&playerheight=526&referrer=https%3A%2F%2Fevent.on24.com%2Finterface%2Fregistration%2Fautoreg%2Findex.html%3Fsessionid%3D1%26eventid%3D4843671%26key%3D4EAC9B5C564CC98FF29E619B06A2F743%26email%3D000a3e42-7952-4dd6-8f8a-34c38ea3cf02%2540platform%26firstname%3Ds%26lastname%3Ds%26deletecookie%3Dtrue%26event_email%3DN%26marketing_email%3DN%26std1%3D0642572014177%26std2%3D0642572014179%26std3%3D550165f7-a44e-4725-9fe6-716f89908c2b%26std4%3D0&eventuserid=745776448&contenttype=A&mediametricsessionid=640613707&mediametricid=6810717&usercd=745776448&mode=launch',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = self._match_valid_url(url)
|
||||
event_id = mobj.group('id_1') or mobj.group('id_2')
|
||||
event_key = mobj.group('key_1') or mobj.group('key_2')
|
||||
event_id, event_key = self._match_valid_url(url).group('id', 'key')
|
||||
|
||||
event_data = self._download_json(
|
||||
'https://event.on24.com/apic/utilApp/EventConsoleCachedServlet',
|
||||
|
101
yt_dlp/extractor/parti.py
Normal file
@ -0,0 +1,101 @@
from .common import InfoExtractor
from ..utils import UserNotLive, int_or_none, parse_iso8601, url_or_none, urljoin
from ..utils.traversal import traverse_obj


class PartiBaseIE(InfoExtractor):
    def _call_api(self, path, video_id, note=None):
        return self._download_json(
            f'https://api-backend.parti.com/parti_v2/profile/{path}', video_id, note)


class PartiVideoIE(PartiBaseIE):
    IE_NAME = 'parti:video'
    _VALID_URL = r'https?://(?:www\.)?parti\.com/video/(?P<id>\d+)'
    _TESTS = [{
        'url': 'https://parti.com/video/66284',
        'info_dict': {
            'id': '66284',
            'ext': 'mp4',
            'title': 'NOW LIVE ',
            'upload_date': '20250327',
            'categories': ['Gaming'],
            'thumbnail': 'https://assets.parti.com/351424_eb9e5250-2821-484a-9c5f-ca99aa666c87.png',
            'channel': 'ItZTMGG',
            'timestamp': 1743044379,
        },
        'params': {'skip_download': 'm3u8'},
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        data = self._call_api(f'get_livestream_channel_info/recent/{video_id}', video_id)

        return {
            'id': video_id,
            'formats': self._extract_m3u8_formats(
                urljoin('https://watch.parti.com', data['livestream_recording']), video_id, 'mp4'),
            **traverse_obj(data, {
                'title': ('event_title', {str}),
                'channel': ('user_name', {str}),
                'thumbnail': ('event_file', {url_or_none}),
                'categories': ('category_name', {str}, filter, all),
                'timestamp': ('event_start_ts', {int_or_none}),
            }),
        }


class PartiLivestreamIE(PartiBaseIE):
    IE_NAME = 'parti:livestream'
    _VALID_URL = r'https?://(?:www\.)?parti\.com/creator/(?P<service>[\w]+)/(?P<id>[\w/-]+)'
    _TESTS = [{
        'url': 'https://parti.com/creator/parti/Capt_Robs_Adventures',
        'info_dict': {
            'id': 'Capt_Robs_Adventures',
            'ext': 'mp4',
            'title': r"re:I'm Live on Parti \d{4}-\d{2}-\d{2} \d{2}:\d{2}",
            'view_count': int,
            'thumbnail': r're:https://assets\.parti\.com/.+\.png',
            'timestamp': 1743879776,
            'upload_date': '20250405',
            'live_status': 'is_live',
        },
        'params': {'skip_download': 'm3u8'},
    }, {
        'url': 'https://parti.com/creator/discord/sazboxgaming/0',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        service, creator_slug = self._match_valid_url(url).group('service', 'id')

        encoded_creator_slug = creator_slug.replace('/', '%23')
        creator_id = self._call_api(
            f'get_user_by_social_media/{service}/{encoded_creator_slug}',
            creator_slug, note='Fetching user ID')

        data = self._call_api(
            f'get_livestream_channel_info/{creator_id}', creator_id,
            note='Fetching user profile feed')['channel_info']

        if not traverse_obj(data, ('channel', 'is_live', {bool})):
            raise UserNotLive(video_id=creator_id)

        channel_info = data['channel']

        return {
            'id': creator_slug,
            'formats': self._extract_m3u8_formats(
                channel_info['playback_url'], creator_slug, live=True, query={
                    'token': channel_info['playback_auth_token'],
                    'player_version': '1.17.0',
                }),
            'is_live': True,
            **traverse_obj(data, {
                'title': ('livestream_event_info', 'event_name', {str}),
                'description': ('livestream_event_info', 'event_description', {str}),
                'thumbnail': ('livestream_event_info', 'livestream_preview_file', {url_or_none}),
                'timestamp': ('stream', 'start_time', {parse_iso8601}),
                'view_count': ('stream', 'viewer_count', {int_or_none}),
            }),
        }
|
43
yt_dlp/extractor/roya.py
Normal file
@ -0,0 +1,43 @@
from .common import InfoExtractor
from ..utils.traversal import traverse_obj


class RoyaLiveIE(InfoExtractor):
    _VALID_URL = r'https?://roya\.tv/live-stream/(?P<id>\d+)'
    _TESTS = [{
        'url': 'https://roya.tv/live-stream/1',
        'info_dict': {
            'id': '1',
            'title': r're:Roya TV \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
            'ext': 'mp4',
            'live_status': 'is_live',
        },
    }, {
        'url': 'https://roya.tv/live-stream/21',
        'info_dict': {
            'id': '21',
            'title': r're:Roya News \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
            'ext': 'mp4',
            'live_status': 'is_live',
        },
    }, {
        'url': 'https://roya.tv/live-stream/10000',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        media_id = self._match_id(url)

        stream_url = self._download_json(
            f'https://ticket.roya-tv.com/api/v5/fastchannel/{media_id}', media_id)['data']['secured_url']

        title = traverse_obj(
            self._download_json('https://backend.roya.tv/api/v01/channels/schedule-pagination', media_id, fatal=False),
            ('data', 0, 'channel', lambda _, v: str(v['id']) == media_id, 'title', {str}, any))

        return {
            'id': media_id,
            'formats': self._extract_m3u8_formats(stream_url, media_id, 'mp4', m3u8_id='hls', live=True),
            'title': title,
            'is_live': True,
        }
|
@ -9,7 +9,9 @@
|
||||
|
||||
|
||||
class RTVSIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?rtvs\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)'
|
||||
IE_NAME = 'stvr'
|
||||
IE_DESC = 'Slovak Television and Radio (formerly RTVS)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:rtvs|stvr)\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)'
|
||||
_TESTS = [{
|
||||
# radio archive
|
||||
'url': 'http://www.rtvs.sk/radio/archiv/11224/414872',
|
||||
@ -19,7 +21,7 @@ class RTVSIE(InfoExtractor):
|
||||
'ext': 'mp3',
|
||||
'title': 'Ostrov pokladov 1 časť.mp3',
|
||||
'duration': 2854,
|
||||
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0000/b1R8.rtvs.jpg',
|
||||
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0000/rtvs-00009383.png',
|
||||
'display_id': '135331',
|
||||
},
|
||||
}, {
|
||||
@ -30,7 +32,7 @@ class RTVSIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Amaro Džives - Náš deň',
|
||||
'description': 'Galavečer pri príležitosti Medzinárodného dňa Rómov.',
|
||||
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg',
|
||||
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg',
|
||||
'timestamp': 1428555900,
|
||||
'upload_date': '20150409',
|
||||
'duration': 4986,
|
||||
@ -47,8 +49,11 @@ class RTVSIE(InfoExtractor):
|
||||
'display_id': '307655',
|
||||
'duration': 831,
|
||||
'upload_date': '20211111',
|
||||
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0916/robin.jpg',
|
||||
'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0916/robin.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.stvr.sk/radio/archiv/11224/414872',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
@ -7,7 +7,6 @@
|
||||
ExtractorError,
|
||||
UnsupportedError,
|
||||
clean_html,
|
||||
determine_ext,
|
||||
extract_attributes,
|
||||
format_field,
|
||||
get_element_by_class,
|
||||
@ -36,7 +35,7 @@ class RumbleEmbedIE(InfoExtractor):
|
||||
'upload_date': '20191020',
|
||||
'channel_url': 'https://rumble.com/c/WMAR',
|
||||
'channel': 'WMAR',
|
||||
'thumbnail': 'https://sp.rmbl.ws/s8/1/5/M/z/1/5Mz1a.qR4e-small-WMAR-2-News-Latest-Headline.jpg',
|
||||
'thumbnail': r're:https://.+\.jpg',
|
||||
'duration': 234,
|
||||
'uploader': 'WMAR',
|
||||
'live_status': 'not_live',
|
||||
@ -52,7 +51,7 @@ class RumbleEmbedIE(InfoExtractor):
|
||||
'upload_date': '20220217',
|
||||
'channel_url': 'https://rumble.com/c/CyberTechNews',
|
||||
'channel': 'CTNews',
|
||||
'thumbnail': 'https://sp.rmbl.ws/s8/6/7/i/9/h/7i9hd.OvCc.jpg',
|
||||
'thumbnail': r're:https://.+\.jpg',
|
||||
'duration': 901,
|
||||
'uploader': 'CTNews',
|
||||
'live_status': 'not_live',
|
||||
@ -114,6 +113,22 @@ class RumbleEmbedIE(InfoExtractor):
|
||||
'live_status': 'was_live',
|
||||
},
|
||||
'params': {'skip_download': True},
|
||||
}, {
|
||||
'url': 'https://rumble.com/embed/v6pezdb',
|
||||
'info_dict': {
|
||||
'id': 'v6pezdb',
|
||||
'ext': 'mp4',
|
||||
'title': '"Es war einmal ein Mädchen" – Ein filmisches Zeitzeugnis aus Leningrad 1944',
|
||||
'uploader': 'RT DE',
|
||||
'channel': 'RT DE',
|
||||
'channel_url': 'https://rumble.com/c/RTDE',
|
||||
'duration': 309,
|
||||
'thumbnail': 'https://1a-1791.com/video/fww1/dc/s8/1/n/z/2/y/nz2yy.qR4e-small-Es-war-einmal-ein-Mdchen-Ei.jpg',
|
||||
'timestamp': 1743703500,
|
||||
'upload_date': '20250403',
|
||||
'live_status': 'not_live',
|
||||
},
|
||||
'params': {'skip_download': True},
|
||||
}, {
|
||||
'url': 'https://rumble.com/embed/ufe9n.v5pv5f',
|
||||
'only_matching': True,
|
||||
@ -168,40 +183,42 @@ def _real_extract(self, url):
|
||||
live_status = None
|
||||
|
||||
formats = []
|
||||
for ext, ext_info in (video.get('ua') or {}).items():
|
||||
if isinstance(ext_info, dict):
|
||||
for height, video_info in ext_info.items():
|
||||
for format_type, format_info in (video.get('ua') or {}).items():
|
||||
if isinstance(format_info, dict):
|
||||
for height, video_info in format_info.items():
|
||||
if not traverse_obj(video_info, ('meta', 'h', {int_or_none})):
|
||||
video_info.setdefault('meta', {})['h'] = height
|
||||
ext_info = ext_info.values()
|
||||
format_info = format_info.values()
|
||||
|
||||
for video_info in ext_info:
|
||||
for video_info in format_info:
|
||||
meta = video_info.get('meta') or {}
|
||||
if not video_info.get('url'):
|
||||
continue
|
||||
if ext == 'hls':
|
||||
# With default query params returns m3u8 variants which are duplicates, without returns tar files
|
||||
if format_type == 'tar':
|
||||
continue
|
||||
if format_type == 'hls':
|
||||
if meta.get('live') is True and video.get('live') == 1:
|
||||
live_status = 'post_live'
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
video_info['url'], video_id,
|
||||
ext='mp4', m3u8_id='hls', fatal=False, live=live_status == 'is_live'))
|
||||
continue
|
||||
timeline = ext == 'timeline'
|
||||
if timeline:
|
||||
ext = determine_ext(video_info['url'])
|
||||
is_timeline = format_type == 'timeline'
|
||||
is_audio = format_type == 'audio'
|
||||
formats.append({
|
||||
'ext': ext,
|
||||
'acodec': 'none' if timeline else None,
|
||||
'acodec': 'none' if is_timeline else None,
|
||||
'vcodec': 'none' if is_audio else None,
|
||||
'url': video_info['url'],
|
||||
'format_id': join_nonempty(ext, format_field(meta, 'h', '%sp')),
|
||||
'format_note': 'Timeline' if timeline else None,
|
||||
'fps': None if timeline else video.get('fps'),
|
||||
'format_id': join_nonempty(format_type, format_field(meta, 'h', '%sp')),
|
||||
'format_note': 'Timeline' if is_timeline else None,
|
||||
'fps': None if is_timeline or is_audio else video.get('fps'),
|
||||
**traverse_obj(meta, {
|
||||
'tbr': 'bitrate',
|
||||
'filesize': 'size',
|
||||
'width': 'w',
|
||||
'height': 'h',
|
||||
}, expected_type=lambda x: int(x) or None),
|
||||
'tbr': ('bitrate', {int_or_none}),
|
||||
'filesize': ('size', {int_or_none}),
|
||||
'width': ('w', {int_or_none}),
|
||||
'height': ('h', {int_or_none}),
|
||||
}),
|
||||
})
|
||||
|
||||
subtitles = {
|
||||
|
@ -122,6 +122,15 @@ def _real_extract(self, url):
|
||||
if traverse_obj(media, ('partOfSeries', {dict})):
|
||||
media['epName'] = traverse_obj(media, ('title', {str}))
|
||||
|
||||
# Need to set different language for forced subs or else they have priority over full subs
|
||||
fixed_subtitles = {}
|
||||
for lang, subs in subtitles.items():
|
||||
for sub in subs:
|
||||
fixed_lang = lang
|
||||
if sub['url'].lower().endswith('_fe.vtt'):
|
||||
fixed_lang += '-forced'
|
||||
fixed_subtitles.setdefault(fixed_lang, []).append(sub)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
**traverse_obj(media, {
|
||||
@ -151,6 +160,6 @@ def _real_extract(self, url):
|
||||
}),
|
||||
}),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'subtitles': fixed_subtitles,
|
||||
'uploader': 'SBSC',
|
||||
}
|
||||
|
@ -14,19 +14,20 @@
|
||||
dict_get,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
join_nonempty,
|
||||
make_archive_id,
|
||||
parse_duration,
|
||||
parse_iso8601,
|
||||
parse_qs,
|
||||
qualities,
|
||||
str_or_none,
|
||||
traverse_obj,
|
||||
try_get,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
url_or_none,
|
||||
urljoin,
|
||||
)
|
||||
from ..utils.traversal import traverse_obj, value
|
||||
|
||||
|
||||
class TwitchBaseIE(InfoExtractor):
|
||||
@ -42,10 +43,10 @@ class TwitchBaseIE(InfoExtractor):
|
||||
'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
|
||||
'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
|
||||
'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
|
||||
'ShareClipRenderStatus': 'f130048a462a0ac86bb54d653c968c514e9ab9ca94db52368c1179e97b0f16eb',
|
||||
'ChannelCollectionsContent': '447aec6a0cc1e8d0a8d7732d47eb0762c336a2294fdb009e9c9d854e49d484b9',
|
||||
'StreamMetadata': 'a647c2a13599e5991e175155f798ca7f1ecddde73f7f341f39009c14dbf59962',
|
||||
'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
|
||||
'VideoAccessToken_Clip': '36b89d2507fce29e5ca551df756d27c1cfe079e2609642b4390aa4c35796eb11',
|
||||
'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
|
||||
'VideoMetadata': '49b5b8f268cdeb259d75b58dcb0c1a748e3b575003448a2333dc5cdafd49adad',
|
||||
'VideoPlayer_ChapterSelectButtonVideo': '8d2793384aac3773beab5e59bd5d6f585aedb923d292800119e03d40cd0f9b41',
|
||||
@ -1083,16 +1084,44 @@ class TwitchClipsIE(TwitchBaseIE):
|
||||
'url': 'https://clips.twitch.tv/FaintLightGullWholeWheat',
|
||||
'md5': '761769e1eafce0ffebfb4089cb3847cd',
|
||||
'info_dict': {
|
||||
'id': '42850523',
|
||||
'id': '396245304',
|
||||
'display_id': 'FaintLightGullWholeWheat',
|
||||
'ext': 'mp4',
|
||||
'title': 'EA Play 2016 Live from the Novo Theatre',
|
||||
'duration': 32,
|
||||
'view_count': int,
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'timestamp': 1465767393,
|
||||
'upload_date': '20160612',
|
||||
'creator': 'EA',
|
||||
'uploader': 'stereotype_',
|
||||
'uploader_id': '43566419',
|
||||
'creators': ['EA'],
|
||||
'channel': 'EA',
|
||||
'channel_id': '25163635',
|
||||
'channel_is_verified': False,
|
||||
'channel_follower_count': int,
|
||||
'uploader': 'EA',
|
||||
'uploader_id': '25163635',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.twitch.tv/xqc/clip/CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
|
||||
'md5': 'e90fe616b36e722a8cfa562547c543f0',
|
||||
'info_dict': {
|
||||
'id': '3207364882',
|
||||
'display_id': 'CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
|
||||
'ext': 'mp4',
|
||||
'title': 'A day in the life of xQc',
|
||||
'duration': 60,
|
||||
'view_count': int,
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'timestamp': 1742869615,
|
||||
'upload_date': '20250325',
|
||||
'creators': ['xQc'],
|
||||
'channel': 'xQc',
|
||||
'channel_id': '71092938',
|
||||
'channel_is_verified': True,
|
||||
'channel_follower_count': int,
|
||||
'uploader': 'xQc',
|
||||
'uploader_id': '71092938',
|
||||
'categories': ['Just Chatting'],
|
||||
},
|
||||
}, {
|
||||
# multiple formats
|
||||
@ -1116,16 +1145,14 @@ class TwitchClipsIE(TwitchBaseIE):
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
slug = self._match_id(url)
|
||||
|
||||
clip = self._download_gql(
|
||||
video_id, [{
|
||||
'operationName': 'VideoAccessToken_Clip',
|
||||
'variables': {
|
||||
'slug': video_id,
|
||||
},
|
||||
slug, [{
|
||||
'operationName': 'ShareClipRenderStatus',
|
||||
'variables': {'slug': slug},
|
||||
}],
|
||||
'Downloading clip access token GraphQL')[0]['data']['clip']
|
||||
'Downloading clip GraphQL')[0]['data']['clip']
|
||||
|
||||
if not clip:
|
||||
raise ExtractorError(
|
||||
@ -1135,81 +1162,71 @@ def _real_extract(self, url):
|
||||
'sig': clip['playbackAccessToken']['signature'],
|
||||
'token': clip['playbackAccessToken']['value'],
|
||||
}
|
||||
|
||||
data = self._download_base_gql(
|
||||
video_id, {
|
||||
'query': '''{
|
||||
clip(slug: "%s") {
|
||||
broadcaster {
|
||||
displayName
|
||||
}
|
||||
createdAt
|
||||
curator {
|
||||
displayName
|
||||
id
|
||||
}
|
||||
durationSeconds
|
||||
id
|
||||
tiny: thumbnailURL(width: 86, height: 45)
|
||||
small: thumbnailURL(width: 260, height: 147)
|
||||
medium: thumbnailURL(width: 480, height: 272)
|
||||
title
|
||||
videoQualities {
|
||||
frameRate
|
||||
quality
|
||||
sourceURL
|
||||
}
|
||||
viewCount
|
||||
}
|
||||
}''' % video_id}, 'Downloading clip GraphQL', fatal=False) # noqa: UP031
|
||||
|
||||
if data:
|
||||
clip = try_get(data, lambda x: x['data']['clip'], dict) or clip
|
||||
asset_default = traverse_obj(clip, ('assets', 0, {dict})) or {}
|
||||
asset_portrait = traverse_obj(clip, ('assets', 1, {dict})) or {}
|
||||
|
||||
formats = []
|
||||
for option in clip.get('videoQualities', []):
|
||||
if not isinstance(option, dict):
|
||||
continue
|
||||
source = url_or_none(option.get('sourceURL'))
|
||||
if not source:
|
||||
continue
|
||||
default_aspect_ratio = float_or_none(asset_default.get('aspectRatio'))
|
||||
formats.extend(traverse_obj(asset_default, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']), {
|
||||
'url': ('sourceURL', {update_url_query(query=access_query)}),
|
||||
'format_id': ('quality', {str}),
|
||||
'height': ('quality', {int_or_none}),
|
||||
'fps': ('frameRate', {float_or_none}),
|
||||
'aspect_ratio': {value(default_aspect_ratio)},
|
||||
})))
|
||||
portrait_aspect_ratio = float_or_none(asset_portrait.get('aspectRatio'))
|
||||
for source in traverse_obj(asset_portrait, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']))):
|
||||
formats.append({
|
||||
'url': update_url_query(source, access_query),
|
||||
'format_id': option.get('quality'),
|
||||
'height': int_or_none(option.get('quality')),
|
||||
'fps': int_or_none(option.get('frameRate')),
|
||||
'url': update_url_query(source['sourceURL'], access_query),
|
||||
'format_id': join_nonempty('portrait', source.get('quality')),
|
||||
'height': int_or_none(source.get('quality')),
|
||||
'fps': float_or_none(source.get('frameRate')),
|
||||
'aspect_ratio': portrait_aspect_ratio,
|
||||
'quality': -2,
|
||||
})
|
||||
|
||||
thumbnails = []
|
||||
for thumbnail_id in ('tiny', 'small', 'medium'):
|
||||
thumbnail_url = clip.get(thumbnail_id)
|
||||
if not thumbnail_url:
|
||||
continue
|
||||
thumb = {
|
||||
'id': thumbnail_id,
|
||||
'url': thumbnail_url,
|
||||
}
|
||||
mobj = re.search(r'-(\d+)x(\d+)\.', thumbnail_url)
|
||||
if mobj:
|
||||
thumb.update({
|
||||
'height': int(mobj.group(2)),
|
||||
'width': int(mobj.group(1)),
|
||||
})
|
||||
thumbnails.append(thumb)
|
||||
thumb_asset_default_url = url_or_none(asset_default.get('thumbnailURL'))
|
||||
if thumb_asset_default_url:
|
||||
thumbnails.append({
|
||||
'id': 'default',
|
||||
'url': thumb_asset_default_url,
|
||||
'preference': 0,
|
||||
})
|
||||
if thumb_asset_portrait_url := url_or_none(asset_portrait.get('thumbnailURL')):
|
||||
thumbnails.append({
|
||||
'id': 'portrait',
|
||||
'url': thumb_asset_portrait_url,
|
||||
'preference': -1,
|
||||
})
|
||||
thumb_default_url = url_or_none(clip.get('thumbnailURL'))
|
||||
if thumb_default_url and thumb_default_url != thumb_asset_default_url:
|
||||
thumbnails.append({
|
||||
'id': 'small',
|
||||
'url': thumb_default_url,
|
||||
'preference': -2,
|
||||
})
|
||||
|
||||
old_id = self._search_regex(r'%7C(\d+)(?:-\d+)?.mp4', formats[-1]['url'], 'old id', default=None)
|
||||
|
||||
return {
|
||||
'id': clip.get('id') or video_id,
|
||||
'id': clip.get('id') or slug,
|
||||
'_old_archive_ids': [make_archive_id(self, old_id)] if old_id else None,
|
||||
'display_id': video_id,
|
||||
'title': clip.get('title'),
|
||||
'display_id': slug,
|
||||
'formats': formats,
|
||||
'duration': int_or_none(clip.get('durationSeconds')),
|
||||
'view_count': int_or_none(clip.get('viewCount')),
|
||||
'timestamp': unified_timestamp(clip.get('createdAt')),
|
||||
'thumbnails': thumbnails,
|
||||
'creator': try_get(clip, lambda x: x['broadcaster']['displayName'], str),
|
||||
'uploader': try_get(clip, lambda x: x['curator']['displayName'], str),
|
||||
'uploader_id': try_get(clip, lambda x: x['curator']['id'], str),
|
||||
**traverse_obj(clip, {
|
||||
'title': ('title', {str}),
|
||||
'duration': ('durationSeconds', {int_or_none}),
|
||||
'view_count': ('viewCount', {int_or_none}),
|
||||
'timestamp': ('createdAt', {parse_iso8601}),
|
||||
'creators': ('broadcaster', 'displayName', {str}, filter, all),
|
||||
'channel': ('broadcaster', 'displayName', {str}),
|
||||
'channel_id': ('broadcaster', 'id', {str}),
|
||||
'channel_follower_count': ('broadcaster', 'followers', 'totalCount', {int_or_none}),
|
||||
'channel_is_verified': ('broadcaster', 'isPartner', {bool}),
|
||||
'uploader': ('broadcaster', 'displayName', {str}),
|
||||
'uploader_id': ('broadcaster', 'id', {str}),
|
||||
'categories': ('game', 'displayName', {str}, filter, all, filter),
|
||||
}),
|
||||
}
|
||||
|
@ -544,7 +544,7 @@ def _real_extract(self, url):
|
||||
'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
|
||||
'duration': ('duration', {int_or_none}),
|
||||
'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
|
||||
'title': ('text', {str}),
|
||||
'title': ('text', {unescapeHTML}),
|
||||
'start_time': 'time',
|
||||
}),
|
||||
}),
|
||||
|
@ -2,15 +2,17 @@
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
bug_reports_message,
|
||||
determine_ext,
|
||||
extract_attributes,
|
||||
int_or_none,
|
||||
lowercase_escape,
|
||||
parse_qs,
|
||||
traverse_obj,
|
||||
qualities,
|
||||
try_get,
|
||||
update_url_query,
|
||||
url_or_none,
|
||||
)
|
||||
from ..utils.traversal import traverse_obj
|
||||
|
||||
|
||||
class YandexVideoIE(InfoExtractor):
|
||||
@ -186,7 +188,22 @@ def _real_extract(self, url):
|
||||
return self.url_result(data_json['video']['url'])
|
||||
|
||||
|
||||
class ZenYandexIE(InfoExtractor):
|
||||
class ZenYandexBaseIE(InfoExtractor):
|
||||
def _fetch_ssr_data(self, url, video_id):
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
redirect = self._search_json(
|
||||
r'(?:var|let|const)\s+it\s*=', webpage, 'redirect', video_id, default={}).get('retpath')
|
||||
if redirect:
|
||||
video_id = self._match_id(redirect)
|
||||
webpage = self._download_webpage(redirect, video_id, note='Redirecting')
|
||||
return video_id, self._search_json(
|
||||
r'(?:var|let|const)\s+_params\s*=\s*\(', webpage, 'metadata', video_id,
|
||||
contains_pattern=r'{["\']ssrData.+}')['ssrData']
|
||||
|
||||
|
||||
class ZenYandexIE(ZenYandexBaseIE):
|
||||
IE_NAME = 'dzen.ru'
|
||||
IE_DESC = 'Дзен (dzen) formerly Яндекс.Дзен (Yandex Zen)'
|
||||
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru(?:/video)?/(media|watch)/(?:(?:id/[^/]+/|[^/]+/)(?:[a-z0-9-]+)-)?(?P<id>[a-z0-9-]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://zen.yandex.ru/media/id/606fd806cc13cb3c58c05cf5/vot-eto-focus-dedy-morozy-na-gidrociklah-60c7c443da18892ebfe85ed7',
|
||||
@ -216,6 +233,7 @@ class ZenYandexIE(InfoExtractor):
|
||||
'timestamp': 1573465585,
|
||||
},
|
||||
'params': {'skip_download': 'm3u8'},
|
||||
'skip': 'The page does not exist',
|
||||
}, {
|
||||
'url': 'https://zen.yandex.ru/video/watch/6002240ff8b1af50bb2da5e3',
|
||||
'info_dict': {
|
||||
@ -227,6 +245,9 @@ class ZenYandexIE(InfoExtractor):
|
||||
'uploader': 'TechInsider',
|
||||
'timestamp': 1611378221,
|
||||
'upload_date': '20210123',
|
||||
'view_count': int,
|
||||
'duration': 243,
|
||||
'tags': ['опыт', 'эксперимент', 'огонь'],
|
||||
},
|
||||
'params': {'skip_download': 'm3u8'},
|
||||
}, {
|
||||
@ -240,6 +261,9 @@ class ZenYandexIE(InfoExtractor):
|
||||
'uploader': 'TechInsider',
|
||||
'upload_date': '20210123',
|
||||
'timestamp': 1611378221,
|
||||
'view_count': int,
|
||||
'duration': 243,
|
||||
'tags': ['опыт', 'эксперимент', 'огонь'],
|
||||
},
|
||||
'params': {'skip_download': 'm3u8'},
|
||||
}, {
|
||||
@ -252,44 +276,56 @@ class ZenYandexIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
redirect = self._search_json(r'var it\s*=', webpage, 'redirect', id, default={}).get('retpath')
|
||||
if redirect:
|
||||
video_id = self._match_id(redirect)
|
||||
webpage = self._download_webpage(redirect, video_id, note='Redirecting')
|
||||
data_json = self._search_json(
|
||||
r'("data"\s*:|data\s*=)', webpage, 'metadata', video_id, contains_pattern=r'{["\']_*serverState_*video.+}')
|
||||
serverstate = self._search_regex(r'(_+serverState_+video-site_[^_]+_+)', webpage, 'server state')
|
||||
uploader = self._search_regex(r'(<a\s*class=["\']card-channel-link[^"\']+["\'][^>]+>)',
|
||||
webpage, 'uploader', default='<a>')
|
||||
uploader_name = extract_attributes(uploader).get('aria-label')
|
||||
item_id = traverse_obj(data_json, (serverstate, 'videoViewer', 'openedItemId', {str}))
|
||||
video_json = traverse_obj(data_json, (serverstate, 'videoViewer', 'items', item_id, {dict})) or {}
|
||||
video_id, ssr_data = self._fetch_ssr_data(url, video_id)
|
||||
video_data = ssr_data['videoMetaResponse']
|
||||
|
||||
formats, subtitles = [], {}
|
||||
for s_url in traverse_obj(video_json, ('video', 'streams', ..., {url_or_none})):
|
||||
quality = qualities(('4', '0', '1', '2', '3', '5', '6', '7'))
|
||||
# Deduplicate stream URLs. The "dzen_dash" query parameter is present in some URLs but can be omitted
|
||||
stream_urls = set(traverse_obj(video_data, (
|
||||
'video', ('id', ('streams', ...), ('mp4Streams', ..., 'url'), ('oneVideoStreams', ..., 'url')),
|
||||
{url_or_none}, {update_url_query(query={'dzen_dash': []})})))
|
||||
for s_url in stream_urls:
|
||||
ext = determine_ext(s_url)
|
||||
if ext == 'mpd':
|
||||
fmts, subs = self._extract_mpd_formats_and_subtitles(s_url, video_id, mpd_id='dash')
|
||||
elif ext == 'm3u8':
|
||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(s_url, video_id, 'mp4')
|
||||
content_type = traverse_obj(parse_qs(s_url), ('ct', 0))
|
||||
if ext == 'mpd' or content_type == '6':
|
||||
fmts, subs = self._extract_mpd_formats_and_subtitles(s_url, video_id, mpd_id='dash', fatal=False)
|
||||
elif ext == 'm3u8' or content_type == '8':
|
||||
fmts, subs = self._extract_m3u8_formats_and_subtitles(s_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
|
||||
elif content_type == '0':
|
||||
format_type = traverse_obj(parse_qs(s_url), ('type', 0))
|
||||
formats.append({
|
||||
'url': s_url,
|
||||
'format_id': format_type,
|
||||
'ext': 'mp4',
|
||||
'quality': quality(format_type),
|
||||
})
|
||||
continue
|
||||
else:
|
||||
self.report_warning(f'Unsupported stream URL: {s_url}{bug_reports_message()}')
|
||||
continue
|
||||
formats.extend(fmts)
|
||||
subtitles = self._merge_subtitles(subtitles, subs)
|
||||
self._merge_subtitles(subs, target=subtitles)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': video_json.get('title') or self._og_search_title(webpage),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'duration': int_or_none(video_json.get('duration')),
|
||||
'view_count': int_or_none(video_json.get('views')),
|
||||
'timestamp': int_or_none(video_json.get('publicationDate')),
|
||||
'uploader': uploader_name or data_json.get('authorName') or try_get(data_json, lambda x: x['publisher']['name']),
|
||||
'description': video_json.get('description') or self._og_search_description(webpage),
|
||||
'thumbnail': self._og_search_thumbnail(webpage) or try_get(data_json, lambda x: x['og']['imageUrl']),
|
||||
**traverse_obj(video_data, {
|
||||
'title': ('title', {str}),
|
||||
'description': ('description', {str}),
|
||||
'thumbnail': ('image', {url_or_none}),
|
||||
'duration': ('video', 'duration', {int_or_none}),
|
||||
'view_count': ('video', 'views', {int_or_none}),
|
||||
'timestamp': ('publicationDate', {int_or_none}),
|
||||
'tags': ('tags', ..., {str}),
|
||||
'uploader': ('source', 'title', {str}),
|
||||
}),
|
||||
}
|
||||
|
||||
|
||||
class ZenYandexChannelIE(InfoExtractor):
|
||||
class ZenYandexChannelIE(ZenYandexBaseIE):
|
||||
IE_NAME = 'dzen.ru:channel'
|
||||
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru/(?!media|video)(?:id/)?(?P<id>[a-z0-9-_]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://zen.yandex.ru/tok_media',
|
||||
@ -323,8 +359,8 @@ class ZenYandexChannelIE(InfoExtractor):
|
||||
'url': 'https://zen.yandex.ru/jony_me',
|
||||
'info_dict': {
|
||||
'id': 'jony_me',
|
||||
'description': 'md5:ce0a5cad2752ab58701b5497835b2cc5',
|
||||
'title': 'JONY ',
|
||||
'description': 'md5:7c30d11dc005faba8826feae99da3113',
|
||||
'title': 'JONY',
|
||||
},
|
||||
'playlist_count': 18,
|
||||
}, {
|
||||
@ -333,9 +369,8 @@ class ZenYandexChannelIE(InfoExtractor):
|
||||
'url': 'https://zen.yandex.ru/tatyanareva',
|
||||
'info_dict': {
|
||||
'id': 'tatyanareva',
|
||||
'description': 'md5:40a1e51f174369ec3ba9d657734ac31f',
|
||||
'description': 'md5:92e56fa730a932ca2483ba5c2186ad96',
|
||||
'title': 'Татьяна Рева',
|
||||
'entries': 'maxcount:200',
|
||||
},
|
||||
'playlist_mincount': 46,
|
||||
}, {
|
||||
@ -348,43 +383,31 @@ class ZenYandexChannelIE(InfoExtractor):
|
||||
'playlist_mincount': 657,
|
||||
}]
|
||||
|
||||
def _entries(self, item_id, server_state_json, server_settings_json):
|
||||
items = (traverse_obj(server_state_json, ('feed', 'items', ...))
|
||||
or traverse_obj(server_settings_json, ('exportData', 'items', ...)))
|
||||
|
||||
more = (traverse_obj(server_state_json, ('links', 'more'))
|
||||
or traverse_obj(server_settings_json, ('exportData', 'more', 'link')))
|
||||
|
||||
def _entries(self, feed_data, channel_id):
|
||||
next_page_id = None
|
||||
for page in itertools.count(1):
|
||||
for item in items or []:
|
||||
if item.get('type') != 'gif':
|
||||
continue
|
||||
video_id = traverse_obj(item, 'publication_id', 'publicationId') or ''
|
||||
yield self.url_result(item['link'], ZenYandexIE, video_id.split(':')[-1])
|
||||
for item in traverse_obj(feed_data, (
|
||||
(None, ('items', lambda _, v: v['tab'] in ('shorts', 'longs'))),
|
||||
'items', lambda _, v: url_or_none(v['link']),
|
||||
)):
|
||||
yield self.url_result(item['link'], ZenYandexIE, item.get('id'), title=item.get('title'))
|
||||
|
||||
more = traverse_obj(feed_data, ('more', 'link', {url_or_none}))
|
||||
current_page_id = next_page_id
|
||||
next_page_id = traverse_obj(parse_qs(more), ('next_page_id', -1))
|
||||
if not all((more, items, next_page_id, next_page_id != current_page_id)):
|
||||
if not all((more, next_page_id, next_page_id != current_page_id)):
|
||||
break
|
||||
|
||||
data = self._download_json(more, item_id, note=f'Downloading Page {page}')
|
||||
items, more = data.get('items'), traverse_obj(data, ('more', 'link'))
|
||||
feed_data = self._download_json(more, channel_id, note=f'Downloading Page {page}')
|
||||
|
||||
def _real_extract(self, url):
|
||||
item_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, item_id)
|
||||
redirect = self._search_json(
|
||||
r'var it\s*=', webpage, 'redirect', item_id, default={}).get('retpath')
|
||||
if redirect:
|
||||
item_id = self._match_id(redirect)
|
||||
webpage = self._download_webpage(redirect, item_id, note='Redirecting')
|
||||
data = self._search_json(
|
||||
r'("data"\s*:|data\s*=)', webpage, 'channel data', item_id, contains_pattern=r'{\"__serverState__.+}')
|
||||
server_state_json = traverse_obj(data, lambda k, _: k.startswith('__serverState__'), get_all=False)
|
||||
server_settings_json = traverse_obj(data, lambda k, _: k.startswith('__serverSettings__'), get_all=False)
|
||||
channel_id = self._match_id(url)
|
||||
channel_id, ssr_data = self._fetch_ssr_data(url, channel_id)
|
||||
channel_data = ssr_data['exportResponse']
|
||||
|
||||
return self.playlist_result(
|
||||
self._entries(item_id, server_state_json, server_settings_json),
|
||||
item_id, traverse_obj(server_state_json, ('channel', 'source', 'title')),
|
||||
traverse_obj(server_state_json, ('channel', 'source', 'description')))
|
||||
self._entries(channel_data['feedData'], channel_id),
|
||||
channel_id, **traverse_obj(channel_data, ('channel', 'source', {
|
||||
'title': ('title', {str}),
|
||||
'description': ('description', {str}),
|
||||
})))
|
||||
|
@ -803,12 +803,14 @@ def _extract_next_continuation_data(cls, renderer):
|
||||
|
||||
@classmethod
|
||||
def _extract_continuation_ep_data(cls, continuation_ep: dict):
|
||||
if isinstance(continuation_ep, dict):
|
||||
continuation = try_get(
|
||||
continuation_ep, lambda x: x['continuationCommand']['token'], str)
|
||||
continuation_commands = traverse_obj(
|
||||
continuation_ep, ('commandExecutorCommand', 'commands', ..., {dict}))
|
||||
continuation_commands.append(continuation_ep)
|
||||
for command in continuation_commands:
|
||||
continuation = traverse_obj(command, ('continuationCommand', 'token', {str}))
|
||||
if not continuation:
|
||||
return
|
||||
ctp = continuation_ep.get('clickTrackingParams')
|
||||
continue
|
||||
ctp = command.get('clickTrackingParams')
|
||||
return cls._build_api_continuation_query(continuation, ctp)
|
||||
|
||||
@classmethod
|
||||
|
@ -1761,6 +1761,16 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
},
|
||||
]
|
||||
|
||||
_PLAYER_JS_VARIANT_MAP = {
|
||||
'main': 'player_ias.vflset/en_US/base.js',
|
||||
'tce': 'player_ias_tce.vflset/en_US/base.js',
|
||||
'tv': 'tv-player-ias.vflset/tv-player-ias.js',
|
||||
'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
|
||||
'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
|
||||
'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
|
||||
}
|
||||
_INVERSE_PLAYER_JS_VARIANT_MAP = {v: k for k, v in _PLAYER_JS_VARIANT_MAP.items()}
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
from yt_dlp.utils import parse_qs
|
||||
@ -1940,6 +1950,21 @@ def _extract_player_url(self, *ytcfgs, webpage=None):
|
||||
get_all=False, expected_type=str)
|
||||
if not player_url:
|
||||
return
|
||||
|
||||
requested_js_variant = self._configuration_arg('player_js_variant', [''])[0] or 'actual'
|
||||
if requested_js_variant in self._PLAYER_JS_VARIANT_MAP:
|
||||
player_id = self._extract_player_info(player_url)
|
||||
original_url = player_url
|
||||
player_url = f'/s/player/{player_id}/{self._PLAYER_JS_VARIANT_MAP[requested_js_variant]}'
|
||||
if original_url != player_url:
|
||||
self.write_debug(
|
||||
f'Forcing "{requested_js_variant}" player JS variant for player {player_id}\n'
|
||||
f' original url = {original_url}', only_once=True)
|
||||
elif requested_js_variant != 'actual':
|
||||
self.report_warning(
|
||||
f'Invalid player JS variant name "{requested_js_variant}" requested. '
|
||||
f'Valid choices are: {", ".join(self._PLAYER_JS_VARIANT_MAP)}', only_once=True)
|
||||
|
||||
return urljoin('https://www.youtube.com', player_url)
|
||||
|
||||
def _download_player_url(self, video_id, fatal=False):
|
||||
@ -1954,6 +1979,17 @@ def _download_player_url(self, video_id, fatal=False):
|
||||
if player_version:
|
||||
return f'https://www.youtube.com/s/player/{player_version}/player_ias.vflset/en_US/base.js'
|
||||
|
||||
def _player_js_cache_key(self, player_url):
|
||||
player_id = self._extract_player_info(player_url)
|
||||
player_path = remove_start(urllib.parse.urlparse(player_url).path, f'/s/player/{player_id}/')
|
||||
variant = self._INVERSE_PLAYER_JS_VARIANT_MAP.get(player_path)
|
||||
if not variant:
|
||||
self.write_debug(
|
||||
f'Unable to determine player JS variant\n'
|
||||
f' player = {player_url}', only_once=True)
|
||||
variant = re.sub(r'[^a-zA-Z0-9]', '_', remove_end(player_path, '.js'))
|
||||
return join_nonempty(player_id, variant)
|
||||
|
||||
def _signature_cache_id(self, example_sig):
|
||||
""" Return a string representation of a signature """
|
||||
return '.'.join(str(len(part)) for part in example_sig.split('.'))
|
||||
@ -1969,25 +2005,24 @@ def _extract_player_info(cls, player_url):
|
||||
return id_m.group('id')
|
||||
|
||||
def _load_player(self, video_id, player_url, fatal=True):
|
||||
player_id = self._extract_player_info(player_url)
|
||||
if player_id not in self._code_cache:
|
||||
player_js_key = self._player_js_cache_key(player_url)
|
||||
if player_js_key not in self._code_cache:
|
||||
code = self._download_webpage(
|
||||
player_url, video_id, fatal=fatal,
|
||||
note='Downloading player ' + player_id,
|
||||
errnote=f'Download of {player_url} failed')
|
||||
note=f'Downloading player {player_js_key}',
|
||||
errnote=f'Download of {player_js_key} failed')
|
||||
if code:
|
||||
self._code_cache[player_id] = code
|
||||
return self._code_cache.get(player_id)
|
||||
self._code_cache[player_js_key] = code
|
||||
return self._code_cache.get(player_js_key)
|
||||
|
||||
def _extract_signature_function(self, video_id, player_url, example_sig):
|
||||
player_id = self._extract_player_info(player_url)
|
||||
|
||||
# Read from filesystem cache
|
||||
func_id = f'js_{player_id}_{self._signature_cache_id(example_sig)}'
|
||||
func_id = join_nonempty(
|
||||
self._player_js_cache_key(player_url), self._signature_cache_id(example_sig))
|
||||
assert os.path.basename(func_id) == func_id
|
||||
|
||||
self.write_debug(f'Extracting signature function {func_id}')
|
||||
cache_spec, code = self.cache.load('youtube-sigfuncs', func_id, min_ver='2025.03.27'), None
|
||||
cache_spec, code = self.cache.load('youtube-sigfuncs', func_id, min_ver='2025.03.31'), None
|
||||
|
||||
if not cache_spec:
|
||||
code = self._load_player(video_id, player_url)
|
||||
@ -2085,22 +2120,22 @@ def inner(*args, **kwargs):
|
||||
return ret
|
||||
return inner
|
||||
|
||||
def _load_nsig_code_from_cache(self, player_id):
|
||||
cache_id = ('nsig code', player_id)
|
||||
def _load_nsig_code_from_cache(self, player_url):
|
||||
cache_id = ('youtube-nsig', self._player_js_cache_key(player_url))
|
||||
|
||||
if func_code := self._player_cache.get(cache_id):
|
||||
return func_code
|
||||
|
||||
func_code = self.cache.load('youtube-nsig', player_id, min_ver='2025.03.27')
|
||||
func_code = self.cache.load(*cache_id, min_ver='2025.03.31')
|
||||
if func_code:
|
||||
self._player_cache[cache_id] = func_code
|
||||
|
||||
return func_code
|
||||
|
||||
def _store_nsig_code_to_cache(self, player_id, func_code):
|
||||
cache_id = ('nsig code', player_id)
|
||||
def _store_nsig_code_to_cache(self, player_url, func_code):
|
||||
cache_id = ('youtube-nsig', self._player_js_cache_key(player_url))
|
||||
if cache_id not in self._player_cache:
|
||||
self.cache.store('youtube-nsig', player_id, func_code)
|
||||
self.cache.store(*cache_id, func_code)
|
||||
self._player_cache[cache_id] = func_code
|
||||
|
||||
def _decrypt_signature(self, s, video_id, player_url):
|
||||
@ -2144,7 +2179,7 @@ def _decrypt_nsig(self, s, video_id, player_url):
|
||||
|
||||
self.write_debug(f'Decrypted nsig {s} => {ret}')
|
||||
# Only cache nsig func JS code to disk if successful, and only once
|
||||
self._store_nsig_code_to_cache(player_id, func_code)
|
||||
self._store_nsig_code_to_cache(player_url, func_code)
|
||||
return ret
|
||||
|
||||
def _extract_n_function_name(self, jscode, player_url=None):
|
||||
@ -2263,7 +2298,7 @@ def _fixup_n_function_code(self, argnames, nsig_code, jscode, player_url):
|
||||
|
||||
def _extract_n_function_code(self, video_id, player_url):
|
||||
player_id = self._extract_player_info(player_url)
|
||||
func_code = self._load_nsig_code_from_cache(player_id)
|
||||
func_code = self._load_nsig_code_from_cache(player_url)
|
||||
jscode = func_code or self._load_player(video_id, player_url)
|
||||
jsi = JSInterpreter(jscode)
|
||||
|
||||
@ -3226,7 +3261,8 @@ def build_fragments(f):
|
||||
if player_url:
|
||||
self.report_warning(
|
||||
f'nsig extraction failed: Some formats may be missing\n'
|
||||
f' n = {query["n"][0]} ; player = {player_url}',
|
||||
f' n = {query["n"][0]} ; player = {player_url}\n'
|
||||
f' {bug_reports_message(before="")}',
|
||||
video_id=video_id, only_once=True)
|
||||
self.write_debug(e, only_once=True)
|
||||
else:
|
||||
@ -3244,7 +3280,7 @@ def build_fragments(f):
|
||||
is_damaged = try_call(lambda: format_duration < duration // 2)
|
||||
if is_damaged:
|
||||
self.report_warning(
|
||||
f'{video_id}: Some formats are possibly damaged. They will be deprioritized', only_once=True)
|
||||
'Some formats are possibly damaged. They will be deprioritized', video_id, only_once=True)
|
||||
|
||||
po_token = fmt.get(STREAMING_DATA_INITIAL_PO_TOKEN)
|
||||
|
||||
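Review note: the cache-key change in the hunks above can be sketched in isolation. This is a minimal, standalone approximation, not the extractor's actual code: it assumes the player ID is passed in directly (the real method derives it with `_extract_player_info`), and it assumes `join_nonempty` joins its non-empty arguments with `-`. The variant map is copied from the diff; `player_js_cache_key` as a free function is hypothetical.

```python
import re
import urllib.parse

# Variant map copied from the diff above.
PLAYER_JS_VARIANT_MAP = {
    'main': 'player_ias.vflset/en_US/base.js',
    'tce': 'player_ias_tce.vflset/en_US/base.js',
    'tv': 'tv-player-ias.vflset/tv-player-ias.js',
    'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
    'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
    'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
}
INVERSE_PLAYER_JS_VARIANT_MAP = {v: k for k, v in PLAYER_JS_VARIANT_MAP.items()}


def player_js_cache_key(player_url, player_id):
    """Hypothetical standalone version: build a cache key from player ID and JS variant."""
    path = urllib.parse.urlparse(player_url).path
    prefix = f'/s/player/{player_id}/'
    # Equivalent of remove_start(): drop the prefix only if it is present.
    player_path = path[len(prefix):] if path.startswith(prefix) else path
    variant = INVERSE_PLAYER_JS_VARIANT_MAP.get(player_path)
    if not variant:
        # Unknown variant: fall back to a filesystem-safe form of the path.
        stem = player_path[:-len('.js')] if player_path.endswith('.js') else player_path
        variant = re.sub(r'[^a-zA-Z0-9]', '_', stem)
    # Assumed join_nonempty() behaviour: skip empty parts, join with '-'.
    return '-'.join(part for part in (player_id, variant) if part)
```

Keying on both the player ID and the variant means two JS variants of the same player version no longer share one cache entry, which appears to be the point of replacing `player_id` with `player_js_key` in `_load_player` and the nsig/sig caches.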
|
@ -500,7 +500,8 @@ def _alias_callback(option, opt_str, value, parser, opts, nargs):
|
||||
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
|
||||
'2021': ['2022', 'no-certifi', 'filename-sanitization'],
|
||||
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
|
||||
'2023': ['prefer-vp9-sort'],
|
||||
'2023': ['2024', 'prefer-vp9-sort'],
|
||||
'2024': [],
|
||||
},
|
||||
}, help=(
|
||||
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
|
||||
@ -2044,7 +2044,7 @@ def url_or_none(url):
     if not url or not isinstance(url, str):
         return None
     url = url.strip()
-    return url if re.match(r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//', url) else None
+    return url if re.match(r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?|wss?):)?//', url) else None
 
 
 def strftime_or_none(timestamp, date_format='%Y%m%d', default=None):
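Review note: the effect of adding `wss?` to the scheme list can be checked with a small standalone sketch. The function body and the pattern are taken from the hunk above; the module-level name `_URL_RE` and the example URLs are illustrative assumptions only.

```python
import re

# Scheme pattern copied from the updated line in the hunk above.
_URL_RE = r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?|wss?):)?//'


def url_or_none(url):
    """Return the stripped URL if it starts with a recognised scheme (or '//'), else None."""
    if not url or not isinstance(url, str):
        return None
    url = url.strip()
    return url if re.match(_URL_RE, url) else None


# WebSocket URLs are now accepted; paths without a scheme or '//' are still rejected.
ws_result = url_or_none(' wss://example.invalid/socket ')
relative_result = url_or_none('/relative/path')
```

With the previous pattern, `ws://` and `wss://` URLs would have fallen through to `None`, so any extractor passing them through `url_or_none` would have silently dropped them.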
@ -1,8 +1,8 @@
 # Autogenerated by devscripts/update-version.py
 
-__version__ = '2025.03.27'
+__version__ = '2025.03.31'
 
-RELEASE_GIT_HEAD = '48be862b32648bff5b3e553e40fca4dcc6e88b28'
+RELEASE_GIT_HEAD = '5e457af57fae9645b1b8fa0ed689229c8fb9656b'
 
 VARIANT = None
 
@ -12,4 +12,4 @@
 
 ORIGIN = 'yt-dlp/yt-dlp'
 
-_pkg_version = '2025.03.27'
+_pkg_version = '2025.03.31'