mirror of https://github.com/yt-dlp/yt-dlp.git synced 2025-11-19 16:05:14 +00:00

pull changes from remote master (#190)

* [scrippsnetworks] Add new extractor(closes #19857)(closes #22981)

* [teachable] Improve locked lessons detection (#23528)

* [teachable] Fail with error message if no video URL found

* [extractors] add missing import for ScrippsNetworksIE

* [brightcove] cache brightcove player policy keys

* [prosiebensat1] improve geo restriction handling(closes #23571)

* [soundcloud] automatically update client id on failing requests

* [spankbang] Fix extraction (closes #23307, closes #23423, closes #23444)

* [spankbang] Improve removed video detection (#23423)

* [brightcove] update policy key on failing requests

* [pornhub] Fix extraction and add support for m3u8 formats (closes #22749, closes #23082)

* [pornhub] Improve locked videos detection (closes #22449, closes #22780)

* [brightcove] invalidate policy key cache on failing requests

* [soundcloud] fix client id extraction for non fatal requests

* [ChangeLog] Actualize
[ci skip]

* [devscripts/create-github-release] Switch to using PAT for authentication

Basic authentication will be deprecated soon

* release 2020.01.01

* [redtube] Detect private videos (#23518)

* [vice] improve extraction(closes #23631)

* [devscripts/create-github-release] Remove unused import

* [wistia] improve format extraction and extract subtitles(closes #22590)

* [nrktv:seriebase] Fix extraction (closes #23625) (#23537)

* [discovery] fix anonymous token extraction(closes #23650)

* [scrippsnetworks] add support for www.discovery.com videos

* [scrippsnetworks] correct test case URL

* [dctp] fix format extraction(closes #23656)

* [pandatv] Remove extractor (#23630)

* [naver] improve extraction

- improve geo-restriction handling
- extract automatic captions
- extract uploader metadata
- extract VLive HLS formats

* [naver] improve metadata extraction

* [cloudflarestream] improve extraction

- add support for bytehighway.net domain
- add support for signed URLs
- extract thumbnail

* [cloudflarestream] improve embed URL extraction

* [lego] fix extraction and extract subtitle(closes #23687)

* [safari] Fix kaltura session extraction (closes #23679) (#23670)

* [orf:fm4] Fix extraction (#23599)

* [orf:radio] Clean description and improve extraction

* [twitter] add support for promo_video_website cards(closes #23711)

* [vodplatform] add support for embed.kwikmotion.com domain

* [ndr:base:embed] Improve thumbnails extraction (closes #23731)

* [canvas] Add support for new API endpoint and update tests (closes #17680, closes #18629)

* [travis] Add flake8 job (#23720)

* [yourporn] Fix extraction (closes #21645, closes #22255, closes #23459)

* [ChangeLog] Actualize
[ci skip]

* release 2020.01.15

* [soundcloud] Restore previews extraction (closes #23739)

* [orf:tvthek] Improve geo restricted videos detection (closes #23741)

* [zype] improve extraction

- extract subtitles(closes #21258)
- support URLs with alternative keys/tokens(#21258)
- extract more metadata

* [americastestkitchen] fix extraction

* [nbc] add support for nbc multi network URLs(closes #23049)

* [ard] improve extraction(closes #23761)

- simplify extraction
- extract age limit and series
- bypass geo-restriction

* [ivi:compilation] Fix entries extraction (closes #23770)

* [24video] Add support for 24video.vip (closes #23753)

* [businessinsider] Fix jwplatform id extraction (closes #22929) (#22954)

* [ard] add a missing condition

* [azmedien] fix extraction(closes #23783)

* [voicerepublic] fix extraction

* [stretchinternet] fix extraction(closes #4319)

* [youtube] Fix sigfunc name extraction (closes #23819)

* [ChangeLog] Actualize
[ci skip]

* release 2020.01.24

* [soundcloud] improve private playlist/set tracks extraction

https://github.com/ytdl-org/youtube-dl/issues/3707#issuecomment-577873539

* [svt] fix article extraction(closes #22897)(closes #22919)

* [svt] fix series extraction(closes #22297)

* [viewlift] improve extraction

- fix extraction(closes #23851)
- add support for authentication
- add support for more domains

* [vimeo] fix album extraction(closes #23864)

* [tva] Relax _VALID_URL (closes #23903)

* [tv5mondeplus] Fix extraction (closes #23907, closes #23911)

* [twitch:stream] Lowercase channel id for stream request (closes #23917)

* [sportdeutschland] Update to new sportdeutschland API

They switched to SSL, but under a different host AND path...
Remove the old test cases because these videos have become unavailable.

* [popcorntimes] Add extractor (closes #23949)

* [thisoldhouse] fix extraction(closes #23951)

* [toggle] Add support for mewatch.sg (closes #23895) (#23930)

* [compat] Introduce compat_realpath (refs #23991)

* [update] Fix updating via symlinks (closes #23991)

* [nytimes] improve format sorting(closes #24010)

* [abc:iview] Support 720p (#22907) (#22921)

* [nova:embed] Fix extraction (closes #23672)

* [nova:embed] Improve (closes #23690)

* [nova] Improve extraction (refs #23690)

* [jpopsuki] Remove extractor (closes #23858)

* [YoutubeDL] Fix playlist entry indexing with --playlist-items (closes #10591, closes #10622)

* [test_YoutubeDL] Fix get_ids

* [test_YoutubeDL] Add tests for #10591 (closes #23873)

* [24video] Add support for porn.24video.net (closes #23779, closes #23784)

* [npr] Add support for streams (closes #24042)

* [ChangeLog] Actualize
[ci skip]

* release 2020.02.16

* [tv2dk:bornholm:play] Fix extraction (#24076)

* [imdb] Fix extraction (closes #23443)

* [wistia] Add support for multiple generic embeds (closes #8347, closes #11385)

* [teachable] Add support for multiple videos per lecture (closes #24101)

* [pornhd] Fix extraction (closes #24128)

* [options] Remove duplicate short option -v for --version (#24162)

* [extractor/common] Convert ISM manifest to unicode before processing on python 2 (#24152)

* [YoutubeDL] Force redirect URL to unicode on python 2

* Remove no longer needed compat_str around geturl

* [youjizz] Fix extraction (closes #24181)

* [test_subtitles] Remove obsolete test

* [zdf:channel] Fix tests

* [zapiks] Fix test

* [xtube] Fix metadata extraction (closes #21073, closes #22455)

* [xtube:user] Fix test

* [telecinco] Fix extraction (refs #24195)

* [telecinco] Add support for article opening videos

* [franceculture] Fix extraction (closes #24204)

* [xhamster] Fix extraction (closes #24205)

* [ChangeLog] Actualize
[ci skip]

* release 2020.03.01

* [vimeo] Fix subtitles URLs (#24209)

* [servus] Add support for new URL schema (closes #23475, closes #23583, closes #24142)

* [youtube:playlist] Fix tests (closes #23872) (#23885)

* [peertube] Improve extraction

* [peertube] Fix issues and improve extraction (closes #23657)

* [pornhub] Improve title extraction (closes #24184)

* [vimeo] fix showcase password protected video extraction(closes #24224)

* [youtube] Fix age-gated videos support without login (closes #24248)

* [youtube] Fix tests

* [ChangeLog] Actualize
[ci skip]

* release 2020.03.06

* [nhk] update API version(closes #24270)

* [youtube] Improve extraction in 429 error conditions (closes #24283)

* [youtube] Improve age-gated videos extraction in 429 error conditions (refs #24283)

* [youtube] Remove outdated code

Additional get_video_info requests don't seem to provide any extra itags any longer

* [README.md] Clarify 429 error

* [pornhub] Add support for pornhubpremium.com (#24288)

* [utils] Add support for cookies with spaces used instead of tabs

* [ChangeLog] Actualize
[ci skip]

* release 2020.03.08

* Revert "[utils] Add support for cookies with spaces used instead of tabs"

According to [1] TABs must be used as separators between fields.
Files produced by some tools with spaces as separators are considered
malformed.

1. https://curl.haxx.se/docs/http-cookies.html

This reverts commit cff99c91d1.
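
To illustrate the TAB-separated format referenced in [1], here is a minimal sketch of a strict line parser. The function name `parse_cookie_line` and the returned dict keys are hypothetical and not taken from youtube-dl; the seven-field layout follows the cookie file format described at the link above.

```python
def parse_cookie_line(line):
    """Parse one cookie-file line; return a dict, or None if the line is skipped."""
    line = line.rstrip('\r\n')
    # Blank lines and lines starting with '#' are treated as comments in this sketch.
    if not line.strip() or line.startswith('#'):
        return None
    # Fields must be separated by TABs; space-separated lines count as malformed.
    fields = line.split('\t')
    if len(fields) != 7:
        return None
    domain, include_subdomains, path, secure, expires, name, value = fields
    # The expiry field is a Unix timestamp; anything else is rejected here.
    if not expires.isdigit():
        return None
    return {
        'domain': domain,
        'include_subdomains': include_subdomains == 'TRUE',
        'path': path,
        'secure': secure == 'TRUE',
        'expires': int(expires),
        'name': name,
        'value': value,
    }


good = '.youtube.com\tTRUE\t/\tFALSE\t1600000000\tPREF\thl=en'
bad = good.replace('\t', ' ')  # same entry, but with spaces as separators
print(parse_cookie_line(good)['name'])
print(parse_cookie_line(bad))
```

The space-separated variant is rejected rather than guessed at, which matches the reasoning given for the revert.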

* [utils] Add reference to cookie file format

* Revert "[vimeo] fix showcase password protected video extraction(closes #24224)"

This reverts commit 12ee431676.

* [nhk] Relax _VALID_URL (#24329)

* [nhk] Remove obsolete rtmp formats (closes #24329)

* [nhk] Update m3u8 URL and use native hls (#24329)

* [ndr] Fix extraction (closes #24326)

* [xtube] Fix formats extraction (closes #24348)

* [xtube] Fix typo

* [hellporno] Fix extraction (closes #24399)

* [cbc:watch] Add support for authentication

* [cbc:watch] Fix authenticated device token caching (closes #19160)

* [soundcloud] fix download url extraction(closes #24394)

* [limelight] remove disabled API requests(closes #24255)

* [bilibili] Add support for new URL schema with BV ids (closes #24439, closes #24442)

* [bilibili] Add support for player.bilibili.com (closes #24402)

* [teachable] Extract chapter metadata (closes #24421)

* [generic] Look for teachable embeds before wistia

* [teachable] Update upskillcourses domain

New version does not use teachable platform any longer

* [teachable] Update gns3 domain

* [teachable] Update test

* [ChangeLog] Actualize
[ci skip]

* [ChangeLog] Actualize
[ci skip]

* release 2020.03.24

* [spankwire] Fix extraction (closes #18924, closes #20648)

* [spankwire] Add support for generic embeds (refs #24633)

* [youporn] Add support for generic embeds

* [mofosex] Add support for generic embeds (closes #24633)

* [tele5] Fix extraction (closes #24553)

* [extractor/common] Skip malformed ISM manifest XMLs while extracting ISM formats (#24667)

* [tv4] Fix ISM formats extraction (closes #24667)

* [twitch:clips] Extend _VALID_URL (closes #24290) (#24642)

* [motherless] Fix extraction (closes #24699)

* [nova:embed] Fix extraction (closes #24700)

* [youtube] Skip broken multifeed videos (closes #24711)

* [soundcloud] Extract AAC format

* [soundcloud] Improve AAC format extraction (closes #19173, closes #24708)

* [thisoldhouse] Fix video id extraction (closes #24548)

Added support for URLs:
with or without "www."
and with either ".chorus.build" or ".com"

It now validates correctly on older URLs
```
<iframe src="https://thisoldhouse.chorus.build/videos/zype/5e33baec27d2e50001d5f52f
```
and newer ones
```
<iframe src="https://www.thisoldhouse.com/videos/zype/5e2b70e95216cc0001615120
```
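
A minimal sketch of a pattern that accepts both forms shown above. The regex, the `EMBED_RE` name, and the `extract_zype_id` helper are hypothetical illustrations and are not the extractor's actual code.

```python
import re

# Hypothetical pattern: optional "www.", and either ".chorus.build" or ".com".
EMBED_RE = re.compile(
    r'https?://(?:www\.)?thisoldhouse\.(?:chorus\.build|com)'
    r'/videos/zype/(?P<id>[0-9a-f]+)')


def extract_zype_id(html):
    """Return the zype video id found in an embed snippet, or None."""
    m = EMBED_RE.search(html)
    return m.group('id') if m else None


older = '<iframe src="https://thisoldhouse.chorus.build/videos/zype/5e33baec27d2e50001d5f52f'
newer = '<iframe src="https://www.thisoldhouse.com/videos/zype/5e2b70e95216cc0001615120'
print(extract_zype_id(older))
print(extract_zype_id(newer))
```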

* [thisoldhouse] Improve video id extraction (closes #24549)

* [youtube] Fix DRM videos detection (refs #24736)

* [options] Clarify doc on --exec command (closes #19087) (#24883)

* [prosiebensat1] Improve extraction and remove 7tv.de support (#24948)

* [prosiebensat1] Extract series metadata

* [tenplay] Relax _VALID_URL (closes #25001)

* [tvplay] fix Viafree extraction(closes #15189)(closes #24473)(closes #24789)

* [yahoo] fix GYAO Player extraction and relax title URL regex(closes #24178)(closes #24778)

* [youtube] Use redirected video id if any (closes #25063)

* [youtube] Improve player id extraction and add tests

* [extractor/common] Extract multiple JSON-LD entries

* [crunchyroll] Fix and improve extraction (closes #25096, closes #25060)

* [ChangeLog] Actualize
[ci skip]

* release 2020.05.03

* [puhutv] Remove no longer available HTTP formats (closes #25124)

* [utils] Improve cookie files support

+ Add support for UTF-8 in cookie files
* Skip malformed cookie file entries instead of crashing (invalid entry len, invalid expires at)

* [dailymotion] Fix typo

* [compat] Introduce compat_cookiejar_Cookie

* [extractor/common] Use compat_cookiejar_Cookie for _set_cookie (closes #23256, closes #24776)

To always ensure cookie name and value are bytestrings on python 2.

* [orf] Add support for more radio stations (closes #24938) (#24968)

* [uol] fix extraction(closes #22007)

* [downloader/http] Finish downloading once received data length matches expected

Always do this if possible, i.e. if Content-Length or expected length is known, not only in test.
This saves an unnecessary final loop iteration that tries to read 0 bytes.

* [downloader/http] Request last data block of exact remaining size

Always request the last data block with the exact size remaining to download, if possible, rather than the current block size.
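
The two downloader changes above can be sketched as a simplified read loop. This is an illustration under assumptions, not the code in downloader/http: the function name `read_all`, its parameters, and the recorded `requested` list are hypothetical.

```python
import io


def read_all(stream, expected_len=None, block_size=4):
    """Read a stream block by block; return (data, sizes requested per read)."""
    chunks = []
    requested = []
    received = 0
    while True:
        to_read = block_size
        if expected_len is not None:
            remaining = expected_len - received
            # Stop as soon as the received length matches the expected one.
            if remaining <= 0:
                break
            # Ask for the exact remaining size on the final block.
            to_read = min(block_size, remaining)
        requested.append(to_read)
        block = stream.read(to_read)
        if not block:
            break
        chunks.append(block)
        received += len(block)
    return b''.join(chunks), requested


payload = b'0123456789'
data_known, sizes_known = read_all(io.BytesIO(payload), expected_len=len(payload))
data_unknown, sizes_unknown = read_all(io.BytesIO(payload))
print(sizes_known)
print(sizes_unknown)
```

When the length is known, the loop ends without the extra empty read that is otherwise needed to detect the end of the stream.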

* [iprima] Improve extraction (closes #25138)

* [youtube] Improve signature cipher extraction (closes #25188)

* [ChangeLog] Actualize
[ci skip]

* release 2020.05.08

* [spike] fix Bellator mgid extraction(closes #25195)

* [bbccouk] PEP8

* [mailru] Fix extraction (closes #24530) (#25239)

* [README.md] flake8 HTTPS URL (#25230)

* [youtube] Add support for yewtu.be (#25226)

* [soundcloud] reduce API playlist page limit(closes #25274)

* [vimeo] improve format extraction and sorting(closes #25285)

* [redtube] Improve title extraction (#25208)

* [indavideo] Switch to HTTPS for API request (#25191)

* [utils] Fix file permissions in write_json_file (closes #12471) (#25122)

* [redtube] Improve formats extraction and extract m3u8 formats (closes #25311, closes #25321)

* [ard] Improve _VALID_URL (closes #25134) (#25198)

* [giantbomb] Extend _VALID_URL (#25222)

* [postprocessor/ffmpeg] Embed series metadata with --add-metadata

* [youtube] Add support for more invidious instances (#25417)

* [ard:beta] Extend _VALID_URL (closes #25405)

* [ChangeLog] Actualize
[ci skip]

* release 2020.05.29

* [jwplatform] Improve embeds extraction (closes #25467)

* [periscope] Fix untitled broadcasts (#25482)

* [twitter:broadcast] Add untitled periscope broadcast test

* [malltv] Add support for sk.mall.tv (#25445)

* [brightcove] Fix subtitles extraction (closes #25540)

* [brightcove] Sort imports

* [twitch] Pass v5 accept header and fix thumbnails extraction (closes #25531)

* [twitch:stream] Fix extraction (closes #25528)

* [twitch:stream] Expect 400 and 410 HTTP errors from API

* [tele5] Prefer jwplatform over nexx (closes #25533)

* [jwplatform] Add support for bypass geo restriction

* [tele5] Bypass geo restriction

* [ChangeLog] Actualize
[ci skip]

* release 2020.06.06

* [kaltura] Add support for multiple embeds on a webpage (closes #25523)

* [youtube] Extract chapters from JSON (closes #24819)

* [facebook] Support single-video ID links

I stumbled upon this at https://www.facebook.com/bwfbadminton/posts/10157127020046316 . No idea how prevalent it is yet.

* [youtube] Fix playlist and feed extraction (closes #25675)

* [youtube] Fix thumbnails extraction and remove uploader id extraction warning (closes #25676)

* [youtube] Fix upload date extraction

* [youtube] Improve view count extraction

* [youtube] Fix uploader id and uploader URL extraction

* [ChangeLog] Actualize
[ci skip]

* release 2020.06.16

* [youtube] Fix categories and improve tags extraction

* [youtube] Force old layout (closes #25682, closes #25683, closes #25680, closes #25686)

* [ChangeLog] Actualize
[ci skip]

* release 2020.06.16.1

* [brightcove] Improve embed detection (closes #25674)

* [bellmedia] add support for cp24.com clip URLs(closes #25764)

* [youtube:playlists] Extend _VALID_URL (closes #25810)

* [youtube] Prevent excess HTTP 301 (#25786)

* [wistia] Restrict embed regex (closes #25969)

* [youtube] Improve description extraction (closes #25937) (#25980)

* [youtube] Fix sigfunc name extraction (closes #26134, closes #26135, closes #26136, closes #26137)

* [ChangeLog] Actualize
[ci skip]

* release 2020.07.28

* [xhamster] Extend _VALID_URL (closes #25789) (#25804)

* [xhamster] Fix extraction (closes #26157) (#26254)

* [xhamster] Extend _VALID_URL (closes #25927)

Co-authored-by: Remita Amine <remitamine@gmail.com>
Co-authored-by: Sergey M․ <dstftw@gmail.com>
Co-authored-by: nmeum <soeren+github@soeren-tempel.net>
Co-authored-by: Roxedus <me@roxedus.dev>
Co-authored-by: Singwai Chan <c.singwai@gmail.com>
Co-authored-by: cdarlint <cdarlint@users.noreply.github.com>
Co-authored-by: Johannes N <31795504+jonolt@users.noreply.github.com>
Co-authored-by: jnozsc <jnozsc@gmail.com>
Co-authored-by: Moritz Patelscheck <moritz.patelscheck@campus.tu-berlin.de>
Co-authored-by: PB <3854688+uno20001@users.noreply.github.com>
Co-authored-by: Philipp Hagemeister <phihag@phihag.de>
Co-authored-by: Xaver Hellauer <software@hellauer.bayern>
Co-authored-by: d2au <d2au.dev@gmail.com>
Co-authored-by: Jan 'Yenda' Trmal <jtrmal@gmail.com>
Co-authored-by: jxu <7989982+jxu@users.noreply.github.com>
Co-authored-by: Martin Ström <name@my-domain.se>
Co-authored-by: The Hatsune Daishi <nao20010128@gmail.com>
Co-authored-by: tsia <github@tsia.de>
Co-authored-by: 3risian <59593325+3risian@users.noreply.github.com>
Co-authored-by: Tristan Waddington <tristan.waddington@gmail.com>
Co-authored-by: Devon Meunier <devon.meunier@gmail.com>
Co-authored-by: Felix Stupp <felix.stupp@outlook.com>
Co-authored-by: tom <tomster954@gmail.com>
Co-authored-by: AndrewMBL <62922222+AndrewMBL@users.noreply.github.com>
Co-authored-by: willbeaufoy <will@willbeaufoy.net>
Co-authored-by: Philipp Stehle <anderschwiedu@googlemail.com>
Co-authored-by: hh0rva1h <61889859+hh0rva1h@users.noreply.github.com>
Co-authored-by: comsomisha <shmelev1996@mail.ru>
Co-authored-by: TotalCaesar659 <14265316+TotalCaesar659@users.noreply.github.com>
Co-authored-by: Juan Francisco Cantero Hurtado <iam@juanfra.info>
Co-authored-by: Dave Loyall <dave@the-good-guys.net>
Co-authored-by: tlsssl <63866177+tlsssl@users.noreply.github.com>
Co-authored-by: Rob <ankenyr@gmail.com>
Co-authored-by: Michael Klein <github@a98shuttle.de>
Co-authored-by: JordanWeatherby <47519158+JordanWeatherby@users.noreply.github.com>
Co-authored-by: striker.sh <19488257+strikersh@users.noreply.github.com>
Co-authored-by: Matej Dujava <mdujava@gmail.com>
Co-authored-by: Glenn Slayden <5589855+glenn-slayden@users.noreply.github.com>
Co-authored-by: MRWITEK <mrvvitek@gmail.com>
Co-authored-by: JChris246 <43832407+JChris246@users.noreply.github.com>
Co-authored-by: TheRealDude2 <the.real.dude@gmx.de>

Author: Aakash Gajjar, 2020-08-25 20:23:34 +05:30 (committed by GitHub)
parent 89cee32ce9
commit b827ee921f
134 changed files with 4146 additions and 2619 deletions

@@ -29,7 +29,6 @@ from ..compat import (
from ..utils import (
bool_or_none,
clean_html,
dict_get,
error_to_compat_str,
extract_attributes,
ExtractorError,
@@ -71,9 +70,14 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
_PLAYLIST_ID_RE = r'(?:PL|LL|EC|UU|FL|RD|UL|TL|PU|OLAK5uy_)[0-9A-Za-z-_]{10,}'
_YOUTUBE_CLIENT_HEADERS = {
'x-youtube-client-name': '1',
'x-youtube-client-version': '1.20200609.04.02',
}
def _set_language(self):
self._set_cookie(
'.youtube.com', 'PREF', 'f1=50000000&hl=en',
'.youtube.com', 'PREF', 'f1=50000000&f6=8&hl=en',
# YouTube sets the expire time to about two months
expire_time=time.time() + 2 * 30 * 24 * 3600)
@@ -299,10 +303,11 @@ class YoutubeEntryListBaseInfoExtractor(YoutubeBaseInfoExtractor):
# Downloading page may result in intermittent 5xx HTTP error
# that is usually worked around with a retry
more = self._download_json(
'https://youtube.com/%s' % mobj.group('more'), playlist_id,
'https://www.youtube.com/%s' % mobj.group('more'), playlist_id,
'Downloading page #%s%s'
% (page_num, ' (retry #%d)' % count if count else ''),
transform_source=uppercase_escape)
transform_source=uppercase_escape,
headers=self._YOUTUBE_CLIENT_HEADERS)
break
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (500, 503):
@@ -389,8 +394,15 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
(?:www\.)?invidious\.drycat\.fr/|
(?:www\.)?tube\.poal\.co/|
(?:www\.)?vid\.wxzm\.sx/|
(?:www\.)?yewtu\.be/|
(?:www\.)?yt\.elukerio\.org/|
(?:www\.)?yt\.lelux\.fi/|
(?:www\.)?invidious\.ggc-project\.de/|
(?:www\.)?yt\.maisputain\.ovh/|
(?:www\.)?invidious\.13ad\.de/|
(?:www\.)?invidious\.toot\.koeln/|
(?:www\.)?invidious\.fdn\.fr/|
(?:www\.)?watch\.nettohikari\.com/|
(?:www\.)?kgg2m7yk5aybusll\.onion/|
(?:www\.)?qklhadlycap4cnod\.onion/|
(?:www\.)?axqzx4s6s54s32yentfqojs3x5i7faxza6xo3ehd4bzzsg2ii4fv2iid\.onion/|
@@ -398,6 +410,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
(?:www\.)?fz253lmuao3strwbfbmx46yu7acac2jz27iwtorgmbqlkurlclmancad\.onion/|
(?:www\.)?invidious\.l4qlywnpwqsluw65ts7md3khrivpirse744un3x7mlskqauz5pyuzgqd\.onion/|
(?:www\.)?owxfohz4kjyv25fvlqilyxast7inivgiktls3th44jhk3ej3i7ya\.b32\.i2p/|
(?:www\.)?4l2dgddgsrkf2ous66i6seeyi6etzfgrue332grh2n7madpwopotugyd\.onion/|
youtube\.googleapis\.com/) # the various hostnames, with wildcard subdomains
(?:.*?\#/)? # handle anchor (#/) redirect urls
(?: # the various things that can precede the ID:
@@ -427,6 +440,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
(?(1).+)? # if we found the ID, everything can follow
$""" % {'playlist_id': YoutubeBaseInfoExtractor._PLAYLIST_ID_RE}
_NEXT_URL_RE = r'[\?&]next_url=([^&]+)'
_PLAYER_INFO_RE = (
r'/(?P<id>[a-zA-Z0-9_-]{8,})/player_ias\.vflset(?:/[a-zA-Z]{2,3}_[a-zA-Z]{2,3})?/base\.(?P<ext>[a-z]+)$',
r'\b(?P<id>vfl[a-zA-Z0-9_-]+)\b.*?\.(?P<ext>[a-z]+)$',
)
_formats = {
'5': {'ext': 'flv', 'width': 400, 'height': 240, 'acodec': 'mp3', 'abr': 64, 'vcodec': 'h263'},
'6': {'ext': 'flv', 'width': 450, 'height': 270, 'acodec': 'mp3', 'abr': 64, 'vcodec': 'h263'},
@@ -570,7 +587,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'upload_date': '20120506',
'title': 'Icona Pop - I Love It (feat. Charli XCX) [OFFICIAL VIDEO]',
'alt_title': 'I Love It (feat. Charli XCX)',
'description': 'md5:f3ceb5ef83a08d95b9d146f973157cc8',
'description': 'md5:19a2f98d9032b9311e686ed039564f63',
'tags': ['Icona Pop i love it', 'sweden', 'pop music', 'big beat records', 'big beat', 'charli',
'xcx', 'charli xcx', 'girls', 'hbo', 'i love it', "i don't care", 'icona', 'pop',
'iconic ep', 'iconic', 'love', 'it'],
@@ -685,12 +702,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'id': 'nfWlot6h_JM',
'ext': 'm4a',
'title': 'Taylor Swift - Shake It Off',
'description': 'md5:bec2185232c05479482cb5a9b82719bf',
'description': 'md5:307195cd21ff7fa352270fe884570ef0',
'duration': 242,
'uploader': 'TaylorSwiftVEVO',
'uploader_id': 'TaylorSwiftVEVO',
'upload_date': '20140818',
'creator': 'Taylor Swift',
},
'params': {
'youtube_include_dash_manifest': True,
@@ -755,11 +771,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'upload_date': '20100430',
'uploader_id': 'deadmau5',
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/deadmau5',
'creator': 'deadmau5',
'creator': 'Dada Life, deadmau5',
'description': 'md5:12c56784b8032162bb936a5f76d55360',
'uploader': 'deadmau5',
'title': 'Deadmau5 - Some Chords (HD)',
'alt_title': 'Some Chords',
'alt_title': 'This Machine Kills Some Chords',
},
'expected_warnings': [
'DASH manifest missing',
@@ -1135,6 +1151,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'skip_download': True,
'youtube_include_dash_manifest': False,
},
'skip': 'not actual anymore',
},
{
# Youtube Music Auto-generated description
@@ -1145,8 +1162,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'title': 'Voyeur Girl',
'description': 'md5:7ae382a65843d6df2685993e90a8628f',
'upload_date': '20190312',
'uploader': 'Various Artists - Topic',
'uploader_id': 'UCVWKBi1ELZn0QX2CBLSkiyw',
'uploader': 'Stephen - Topic',
'uploader_id': 'UC-pWHpBjdGG69N9mM2auIAA',
'artist': 'Stephen',
'track': 'Voyeur Girl',
'album': 'it\'s too much love to know my dear',
@@ -1210,7 +1227,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'id': '-hcAI0g-f5M',
'ext': 'mp4',
'title': 'Put It On Me',
'description': 'md5:93c55acc682ae7b0c668f2e34e1c069e',
'description': 'md5:f6422397c07c4c907c6638e1fee380a5',
'upload_date': '20180426',
'uploader': 'Matt Maeson - Topic',
'uploader_id': 'UCnEkIGqtGcQMLk73Kp-Q5LQ',
@@ -1228,6 +1245,26 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'url': 'https://www.youtubekids.com/watch?v=3b8nCWDgZ6Q',
'only_matching': True,
},
{
# invalid -> valid video id redirection
'url': 'DJztXj2GPfl',
'info_dict': {
'id': 'DJztXj2GPfk',
'ext': 'mp4',
'title': 'Panjabi MC - Mundian To Bach Ke (The Dictator Soundtrack)',
'description': 'md5:bf577a41da97918e94fa9798d9228825',
'upload_date': '20090125',
'uploader': 'Prochorowka',
'uploader_id': 'Prochorowka',
'uploader_url': r're:https?://(?:www\.)?youtube\.com/user/Prochorowka',
'artist': 'Panjabi MC',
'track': 'Beware of the Boys (Mundian to Bach Ke) - Motivo Hi-Lectro Remix',
'album': 'Beware of the Boys (Mundian To Bach Ke)',
},
'params': {
'skip_download': True,
},
}
]
def __init__(self, *args, **kwargs):
@@ -1254,14 +1291,18 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
""" Return a string representation of a signature """
return '.'.join(compat_str(len(part)) for part in example_sig.split('.'))
def _extract_signature_function(self, video_id, player_url, example_sig):
id_m = re.match(
r'.*?-(?P<id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player(?:-new)?|(?:/[a-z]{2,3}_[A-Z]{2})?/base)?\.(?P<ext>[a-z]+)$',
player_url)
if not id_m:
@classmethod
def _extract_player_info(cls, player_url):
for player_re in cls._PLAYER_INFO_RE:
id_m = re.search(player_re, player_url)
if id_m:
break
else:
raise ExtractorError('Cannot identify player %r' % player_url)
player_type = id_m.group('ext')
player_id = id_m.group('id')
return id_m.group('ext'), id_m.group('id')
def _extract_signature_function(self, video_id, player_url, example_sig):
player_type, player_id = self._extract_player_info(player_url)
# Read from filesystem cache
func_id = '%s_%s_%s' % (
@@ -1343,6 +1384,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
funcname = self._search_regex(
(r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'(?:\b|[^a-zA-Z0-9$])(?P<sig>[a-zA-Z0-9$]{2})\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
# Obsolete patterns
r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
@@ -1616,8 +1658,63 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
video_id = mobj.group(2)
return video_id
def _extract_chapters_from_json(self, webpage, video_id, duration):
if not webpage:
return
player = self._parse_json(
self._search_regex(
r'RELATED_PLAYER_ARGS["\']\s*:\s*({.+})\s*,?\s*\n', webpage,
'player args', default='{}'),
video_id, fatal=False)
if not player or not isinstance(player, dict):
return
watch_next_response = player.get('watch_next_response')
if not isinstance(watch_next_response, compat_str):
return
response = self._parse_json(watch_next_response, video_id, fatal=False)
if not response or not isinstance(response, dict):
return
chapters_list = try_get(
response,
lambda x: x['playerOverlays']
['playerOverlayRenderer']
['decoratedPlayerBarRenderer']
['decoratedPlayerBarRenderer']
['playerBar']
['chapteredPlayerBarRenderer']
['chapters'],
list)
if not chapters_list:
return
def chapter_time(chapter):
return float_or_none(
try_get(
chapter,
lambda x: x['chapterRenderer']['timeRangeStartMillis'],
int),
scale=1000)
chapters = []
for next_num, chapter in enumerate(chapters_list, start=1):
start_time = chapter_time(chapter)
if start_time is None:
continue
end_time = (chapter_time(chapters_list[next_num])
if next_num < len(chapters_list) else duration)
if end_time is None:
continue
title = try_get(
chapter, lambda x: x['chapterRenderer']['title']['simpleText'],
compat_str)
chapters.append({
'start_time': start_time,
'end_time': end_time,
'title': title,
})
return chapters
@staticmethod
def _extract_chapters(description, duration):
def _extract_chapters_from_description(description, duration):
if not description:
return None
chapter_lines = re.findall(
@@ -1651,6 +1748,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
})
return chapters
def _extract_chapters(self, webpage, description, video_id, duration):
return (self._extract_chapters_from_json(webpage, video_id, duration)
or self._extract_chapters_from_description(description, duration))
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
@@ -1678,7 +1779,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# Get video webpage
url = proto + '://www.youtube.com/watch?v=%s&gl=US&hl=en&has_verified=1&bpctr=9999999999' % video_id
video_webpage = self._download_webpage(url, video_id)
video_webpage, urlh = self._download_webpage_handle(url, video_id)
qs = compat_parse_qs(compat_urllib_parse_urlparse(urlh.geturl()).query)
video_id = qs.get('v', [None])[0] or video_id
# Attempt to extract SWF player URL
mobj = re.search(r'swfConfig.*?"(https?:\\/\\/.*?watch.*?-.*?\.swf)"', video_webpage)
@@ -1707,9 +1811,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def extract_view_count(v_info):
return int_or_none(try_get(v_info, lambda x: x['view_count'][0]))
def extract_token(v_info):
return dict_get(v_info, ('account_playback_token', 'accountPlaybackToken', 'token'))
def extract_player_response(player_response, video_id):
pl_response = str_or_none(player_response)
if not pl_response:
@@ -1722,6 +1823,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
player_response = {}
# Get video info
video_info = {}
embed_webpage = None
if re.search(r'player-age-gate-content">', video_webpage) is not None:
age_gate = True
@@ -1736,19 +1838,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
r'"sts"\s*:\s*(\d+)', embed_webpage, 'sts', default=''),
})
video_info_url = proto + '://www.youtube.com/get_video_info?' + data
video_info_webpage = self._download_webpage(
video_info_url, video_id,
note='Refetching age-gated info webpage',
errnote='unable to download video info webpage')
video_info = compat_parse_qs(video_info_webpage)
pl_response = video_info.get('player_response', [None])[0]
player_response = extract_player_response(pl_response, video_id)
add_dash_mpd(video_info)
view_count = extract_view_count(video_info)
try:
video_info_webpage = self._download_webpage(
video_info_url, video_id,
note='Refetching age-gated info webpage',
errnote='unable to download video info webpage')
except ExtractorError:
video_info_webpage = None
if video_info_webpage:
video_info = compat_parse_qs(video_info_webpage)
pl_response = video_info.get('player_response', [None])[0]
player_response = extract_player_response(pl_response, video_id)
add_dash_mpd(video_info)
view_count = extract_view_count(video_info)
else:
age_gate = False
video_info = None
sts = None
# Try looking directly into the video webpage
ytplayer_config = self._get_ytplayer_config(video_id, video_webpage)
if ytplayer_config:
@@ -1765,61 +1869,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
args['ypc_vid'], YoutubeIE.ie_key(), video_id=args['ypc_vid'])
if args.get('livestream') == '1' or args.get('live_playback') == 1:
is_live = True
sts = ytplayer_config.get('sts')
if not player_response:
player_response = extract_player_response(args.get('player_response'), video_id)
if not video_info or self._downloader.params.get('youtube_include_dash_manifest', True):
add_dash_mpd_pr(player_response)
# We also try looking in get_video_info since it may contain different dashmpd
# URL that points to a DASH manifest with possibly different itag set (some itags
# are missing from DASH manifest pointed by webpage's dashmpd, some - from DASH
# manifest pointed by get_video_info's dashmpd).
# The general idea is to take a union of itags of both DASH manifests (for example
# video with such 'manifest behavior' see https://github.com/ytdl-org/youtube-dl/issues/6093)
self.report_video_info_webpage_download(video_id)
for el in ('embedded', 'detailpage', 'vevo', ''):
query = {
'video_id': video_id,
'ps': 'default',
'eurl': '',
'gl': 'US',
'hl': 'en',
}
if el:
query['el'] = el
if sts:
query['sts'] = sts
video_info_webpage = self._download_webpage(
'%s://www.youtube.com/get_video_info' % proto,
video_id, note=False,
errnote='unable to download video info webpage',
fatal=False, query=query)
if not video_info_webpage:
continue
get_video_info = compat_parse_qs(video_info_webpage)
if not player_response:
pl_response = get_video_info.get('player_response', [None])[0]
player_response = extract_player_response(pl_response, video_id)
add_dash_mpd(get_video_info)
if view_count is None:
view_count = extract_view_count(get_video_info)
if not video_info:
video_info = get_video_info
get_token = extract_token(get_video_info)
if get_token:
# Different get_video_info requests may report different results, e.g.
# some may report video unavailability, but some may serve it without
# any complaint (see https://github.com/ytdl-org/youtube-dl/issues/7362,
# the original webpage as well as el=info and el=embedded get_video_info
# requests report video unavailability due to geo restriction while
# el=detailpage succeeds and returns valid data). This is probably
# due to YouTube measures against IP ranges of hosting providers.
# Working around by preferring the first succeeded video_info containing
# the token if no such video_info yet was found.
token = extract_token(video_info)
if not token:
video_info = get_video_info
break
def extract_unavailable_message():
messages = []
@@ -1832,16 +1885,22 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if messages:
return '\n'.join(messages)
if not video_info:
if not video_info and not player_response:
unavailable_message = extract_unavailable_message()
if not unavailable_message:
unavailable_message = 'Unable to extract video data'
raise ExtractorError(
'YouTube said: %s' % unavailable_message, expected=True, video_id=video_id)
if not isinstance(video_info, dict):
video_info = {}
video_details = try_get(
player_response, lambda x: x['videoDetails'], dict) or {}
microformat = try_get(
player_response, lambda x: x['microformat']['playerMicroformatRenderer'], dict) or {}
video_title = video_info.get('title', [None])[0] or video_details.get('title')
if not video_title:
self._downloader.report_warning('Unable to extract video title')
@@ -1871,7 +1930,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
''', replace_url, video_description)
video_description = clean_html(video_description)
else:
video_description = self._html_search_meta('description', video_webpage) or video_details.get('shortDescription')
video_description = video_details.get('shortDescription') or self._html_search_meta('description', video_webpage)
if not smuggled_data.get('force_singlefeed', False):
if not self._downloader.params.get('noplaylist'):
@@ -1888,15 +1947,26 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# fields may contain comma as well (see
# https://github.com/ytdl-org/youtube-dl/issues/8536)
feed_data = compat_parse_qs(compat_urllib_parse_unquote_plus(feed))
def feed_entry(name):
return try_get(feed_data, lambda x: x[name][0], compat_str)
feed_id = feed_entry('id')
if not feed_id:
continue
feed_title = feed_entry('title')
title = video_title
if feed_title:
title += ' (%s)' % feed_title
entries.append({
'_type': 'url_transparent',
'ie_key': 'Youtube',
'url': smuggle_url(
'%s://www.youtube.com/watch?v=%s' % (proto, feed_data['id'][0]),
{'force_singlefeed': True}),
'title': '%s (%s)' % (video_title, feed_data['title'][0]),
'title': title,
})
feed_ids.append(feed_data['id'][0])
feed_ids.append(feed_id)
self.to_screen(
'Downloading multifeed video (%s) - add --no-playlist to just download video %s'
% (', '.join(feed_ids), video_id))
@@ -1908,6 +1978,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
view_count = extract_view_count(video_info)
if view_count is None and video_details:
view_count = int_or_none(video_details.get('viewCount'))
if view_count is None and microformat:
view_count = int_or_none(microformat.get('viewCount'))
if is_live is None:
is_live = bool_or_none(video_details.get('isLive'))
@@ -1967,12 +2039,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
}
for fmt in streaming_formats:
if fmt.get('drm_families'):
if fmt.get('drmFamilies') or fmt.get('drm_families'):
continue
url = url_or_none(fmt.get('url'))
if not url:
cipher = fmt.get('cipher')
cipher = fmt.get('cipher') or fmt.get('signatureCipher')
if not cipher:
continue
url_data = compat_parse_qs(cipher)
@@ -2023,22 +2095,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if self._downloader.params.get('verbose'):
if player_url is None:
player_version = 'unknown'
player_desc = 'unknown'
else:
if player_url.endswith('swf'):
player_version = self._search_regex(
r'-(.+?)(?:/watch_as3)?\.swf$', player_url,
'flash player', fatal=False)
player_desc = 'flash player %s' % player_version
else:
player_version = self._search_regex(
[r'html5player-([^/]+?)(?:/html5player(?:-new)?)?\.js',
r'(?:www|player(?:_ias)?)-([^/]+)(?:/[a-z]{2,3}_[A-Z]{2})?/base\.js'],
player_url,
'html5 player', fatal=False)
player_desc = 'html5 player %s' % player_version
player_type, player_version = self._extract_player_info(player_url)
player_desc = '%s player %s' % ('flash' if player_type == 'swf' else 'html5', player_version)
parts_sizes = self._signature_cache_id(encrypted_sig)
self.to_screen('{%s} signature length %s, %s' %
(format_id, parts_sizes, player_desc))
@@ -2171,7 +2231,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
video_uploader_id = mobj.group('uploader_id')
video_uploader_url = mobj.group('uploader_url')
else:
self._downloader.report_warning('unable to extract uploader nickname')
owner_profile_url = url_or_none(microformat.get('ownerProfileUrl'))
if owner_profile_url:
video_uploader_id = self._search_regex(
r'(?:user|channel)/([^/]+)', owner_profile_url, 'uploader id',
default=None)
video_uploader_url = owner_profile_url
channel_id = (
str_or_none(video_details.get('channelId'))
@@ -2182,17 +2247,33 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
video_webpage, 'channel id', default=None, group='id'))
channel_url = 'http://www.youtube.com/channel/%s' % channel_id if channel_id else None
# thumbnail image
# We try first to get a high quality image:
m_thumb = re.search(r'<span itemprop="thumbnail".*?href="(.*?)">',
video_webpage, re.DOTALL)
if m_thumb is not None:
video_thumbnail = m_thumb.group(1)
elif 'thumbnail_url' not in video_info:
self._downloader.report_warning('unable to extract video thumbnail')
thumbnails = []
thumbnails_list = try_get(
video_details, lambda x: x['thumbnail']['thumbnails'], list) or []
for t in thumbnails_list:
if not isinstance(t, dict):
continue
thumbnail_url = url_or_none(t.get('url'))
if not thumbnail_url:
continue
thumbnails.append({
'url': thumbnail_url,
'width': int_or_none(t.get('width')),
'height': int_or_none(t.get('height')),
})
if not thumbnails:
video_thumbnail = None
else: # don't panic if we can't find it
video_thumbnail = compat_urllib_parse_unquote_plus(video_info['thumbnail_url'][0])
# We try first to get a high quality image:
m_thumb = re.search(r'<span itemprop="thumbnail".*?href="(.*?)">',
video_webpage, re.DOTALL)
if m_thumb is not None:
video_thumbnail = m_thumb.group(1)
thumbnail_url = try_get(video_info, lambda x: x['thumbnail_url'][0], compat_str)
if thumbnail_url:
video_thumbnail = compat_urllib_parse_unquote_plus(thumbnail_url)
if video_thumbnail:
thumbnails.append({'url': video_thumbnail})
# upload date
upload_date = self._html_search_meta(
@@ -2202,6 +2283,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
[r'(?s)id="eow-date.*?>(.*?)</span>',
r'(?:id="watch-uploader-info".*?>.*?|["\']simpleText["\']\s*:\s*["\'])(?:Published|Uploaded|Streamed live|Started) on (.+?)[<"\']'],
video_webpage, 'upload date', default=None)
if not upload_date:
upload_date = microformat.get('publishDate') or microformat.get('uploadDate')
upload_date = unified_strdate(upload_date)
video_license = self._html_search_regex(
@@ -2273,17 +2356,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
m_cat_container = self._search_regex(
r'(?s)<h4[^>]*>\s*Category\s*</h4>\s*<ul[^>]*>(.*?)</ul>',
video_webpage, 'categories', default=None)
category = None
if m_cat_container:
category = self._html_search_regex(
r'(?s)<a[^<]+>(.*?)</a>', m_cat_container, 'category',
default=None)
video_categories = None if category is None else [category]
else:
video_categories = None
if not category:
category = try_get(
microformat, lambda x: x['category'], compat_str)
video_categories = None if category is None else [category]
video_tags = [
unescapeHTML(m.group('content'))
for m in re.finditer(self._meta_regex('og:video:tag'), video_webpage)]
if not video_tags:
video_tags = try_get(video_details, lambda x: x['keywords'], list)
def _extract_count(count_name):
return str_to_int(self._search_regex(
@@ -2334,7 +2421,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
errnote='Unable to download video annotations', fatal=False,
data=urlencode_postdata({xsrf_field_name: xsrf_token}))
chapters = self._extract_chapters(description_original, video_duration)
chapters = self._extract_chapters(video_webpage, description_original, video_id, video_duration)
# Look for the DASH manifest
if self._downloader.params.get('youtube_include_dash_manifest', True):
@@ -2391,30 +2478,23 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
f['stretched_ratio'] = ratio
if not formats:
token = extract_token(video_info)
if not token:
if 'reason' in video_info:
if 'The uploader has not made this video available in your country.' in video_info['reason']:
regions_allowed = self._html_search_meta(
'regionsAllowed', video_webpage, default=None)
countries = regions_allowed.split(',') if regions_allowed else None
self.raise_geo_restricted(
msg=video_info['reason'][0], countries=countries)
reason = video_info['reason'][0]
if 'Invalid parameters' in reason:
unavailable_message = extract_unavailable_message()
if unavailable_message:
reason = unavailable_message
raise ExtractorError(
'YouTube said: %s' % reason,
expected=True, video_id=video_id)
else:
raise ExtractorError(
'"token" parameter not in video info for unknown reason',
video_id=video_id)
if not formats and (video_info.get('license_info') or try_get(player_response, lambda x: x['streamingData']['licenseInfos'])):
raise ExtractorError('This video is DRM protected.', expected=True)
if 'reason' in video_info:
if 'The uploader has not made this video available in your country.' in video_info['reason']:
regions_allowed = self._html_search_meta(
'regionsAllowed', video_webpage, default=None)
countries = regions_allowed.split(',') if regions_allowed else None
self.raise_geo_restricted(
msg=video_info['reason'][0], countries=countries)
reason = video_info['reason'][0]
if 'Invalid parameters' in reason:
unavailable_message = extract_unavailable_message()
if unavailable_message:
reason = unavailable_message
raise ExtractorError(
'YouTube said: %s' % reason,
expected=True, video_id=video_id)
if video_info.get('license_info') or try_get(player_response, lambda x: x['streamingData']['licenseInfos']):
raise ExtractorError('This video is DRM protected.', expected=True)
self._sort_formats(formats)
@@ -2432,7 +2512,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'creator': video_creator or artist,
'title': video_title,
'alt_title': video_alt_title or track,
'thumbnail': video_thumbnail,
'thumbnails': thumbnails,
'description': video_description,
'categories': video_categories,
'tags': video_tags,
@@ -2494,20 +2574,23 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
_VIDEO_RE = _VIDEO_RE_TPL % r'(?P<id>[0-9A-Za-z_-]{11})'
IE_NAME = 'youtube:playlist'
_TESTS = [{
'url': 'https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re',
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
'info_dict': {
'title': 'ytdl test PL',
'id': 'PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re',
'uploader_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
'uploader': 'Sergey M.',
'id': 'PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc',
'title': 'youtube-dl public playlist',
},
'playlist_count': 3,
'playlist_count': 1,
}, {
'url': 'https://www.youtube.com/playlist?list=PLtPgu7CB4gbZDA7i_euNxn75ISqxwZPYx',
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
'info_dict': {
'id': 'PLtPgu7CB4gbZDA7i_euNxn75ISqxwZPYx',
'title': 'YDL_Empty_List',
'uploader_id': 'UCmlqkdCBesrv2Lak1mF_MxA',
'uploader': 'Sergey M.',
'id': 'PL4lCao7KL_QFodcLWhDpGCYnngnHtQ-Xf',
'title': 'youtube-dl empty playlist',
},
'playlist_count': 0,
'skip': 'This playlist is private',
}, {
'note': 'Playlist with deleted videos (#651). As a bonus, the video #51 is also twice in this list.',
'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
@@ -2517,7 +2600,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
'uploader': 'Christiaan008',
'uploader_id': 'ChRiStIaAn008',
},
'playlist_count': 95,
'playlist_count': 96,
}, {
'note': 'issue #673',
'url': 'PLBB231211A4F62143',
@@ -2693,7 +2776,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
ids = []
last_id = playlist_id[-11:]
for n in itertools.count(1):
url = 'https://youtube.com/watch?v=%s&list=%s' % (last_id, playlist_id)
url = 'https://www.youtube.com/watch?v=%s&list=%s' % (last_id, playlist_id)
webpage = self._download_webpage(
url, playlist_id, 'Downloading page {0} of Youtube mix'.format(n))
new_ids = orderedSet(re.findall(
@@ -3033,7 +3116,7 @@ class YoutubeLiveIE(YoutubeBaseInfoExtractor):
class YoutubePlaylistsIE(YoutubePlaylistsBaseInfoExtractor):
IE_DESC = 'YouTube.com user/channel playlists'
_VALID_URL = r'https?://(?:\w+\.)?youtube\.com/(?:user|channel)/(?P<id>[^/]+)/playlists'
_VALID_URL = r'https?://(?:\w+\.)?youtube\.com/(?:user|channel|c)/(?P<id>[^/]+)/playlists'
IE_NAME = 'youtube:playlists'
_TESTS = [{
@@ -3059,6 +3142,9 @@ class YoutubePlaylistsIE(YoutubePlaylistsBaseInfoExtractor):
'title': 'Chem Player',
},
'skip': 'Blocked',
}, {
'url': 'https://www.youtube.com/c/ChristophLaimer/playlists',
'only_matching': True,
}]
@@ -3203,9 +3289,10 @@ class YoutubeFeedsInfoExtractor(YoutubeBaseInfoExtractor):
break
more = self._download_json(
'https://youtube.com/%s' % mobj.group('more'), self._PLAYLIST_TITLE,
'https://www.youtube.com/%s' % mobj.group('more'), self._PLAYLIST_TITLE,
'Downloading page #%s' % page_num,
transform_source=uppercase_escape)
transform_source=uppercase_escape,
headers=self._YOUTUBE_CLIENT_HEADERS)
content_html = more['content_html']
more_widget_html = more['load_more_widget_html']