1
0
mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-11 17:31:31 +00:00

Compare commits

..

84 Commits

Author SHA1 Message Date
pukkandan
fac988053f Release 2021.05.11
* and some documentation improvements
2021-05-11 13:35:05 +05:30
pukkandan
61241abbb0 [generic] Respect the encoding in manifest 2021-05-11 13:32:03 +05:30
pukkandan
53ed7066ab Option --compat-options to revert some of yt-dlp's changes
* Deprecates `--list-formats-as-table`, `--list-formats-old`
2021-05-11 13:30:48 +05:30
pukkandan
a61f4b287b Deprecate support for python versions < 3.6
Closes #267
2021-05-09 04:32:23 +05:30
pukkandan
486fb17975 Remove -l, -t, -A completely and disable --auto-number, --title, --literal, --id 2021-05-09 04:22:29 +05:30
pukkandan
2f567473c6 [Plugins] Prioritize plugins over standard extractors
and prevent plugins from overwriting the standard extractor classes

Closes #304
2021-05-09 04:22:27 +05:30
pukkandan
000ee7ef34 [fragment] Make sure first segment is not skipped 2021-05-09 04:22:26 +05:30
pukkandan
41d1cca328 Update to ytdl-commit-a726009
[blinkx] Remove extractor
a726009987
2021-05-06 21:31:20 +05:30
pukkandan
717297545b Fix playlist_index and add playlist_autonumber (#302)
Now `playlist_index` is always the position of the video in the actual playlist and `playlist_autonumber` is the position of the item in the playlist queue
2021-05-06 20:56:19 +05:30
pukkandan
e8e738406a Add experimental option --check-formats to test the URLs before format selection 2021-05-06 20:50:44 +05:30
pukkandan
e625be0d10 Improve output template internal formatting
* Allow slicing lists/strings using `field.start:end:step`
* A field can also be used as offset like `field1+num+field2`
* A default value can be given using `field|default`
* Capture all format strings and set it to `None` if invalid. This prevents invalid fields from causing errors
2021-05-06 20:28:58 +05:30
pukkandan
12e73423f1 [plutotv] Fix format extraction for some urls
* And fallback to the first urls if ad-free urls can't be found
Closes #299
2021-05-06 20:28:57 +05:30
pukkandan
7700b37f39 [plutotv] Extract subtitles from manifests 2021-05-06 20:28:56 +05:30
Ashish
c28cfda81f [SonyLiv] Fix title and series extraction (#301)
Authored by: Ashish0804
2021-05-06 20:27:43 +05:30
pukkandan
848887eb7a [downloader] Fix quiet and to_stderr 2021-05-04 22:38:10 +05:30
pukkandan
3158150cb7 [utils] Add network_exceptions 2021-05-04 22:36:18 +05:30
pukkandan
6ef6bcbd6b [fragment] Ensure the file is closed on error 2021-05-04 22:27:44 +05:30
pukkandan
06425e9621 [blinkx] Minor fix
Fixes: https://github.com/ytdl-org/youtube-dl/issues/28941
2021-05-04 22:27:44 +05:30
pukkandan
4d224a3022 [embedthumbnail] Fix bug where jpeg thumbnails were converted again
Closes #297
2021-05-04 22:18:40 +05:30
pukkandan
f59ae58163 Fix number of digits in %(playlist_index)s
When used with `--playlist-(items|start|end)`, the number of digits should depend on the last index in the playlist, not number of items
2021-05-03 22:49:05 +05:30
pukkandan
0d1bb027aa Move option warnings to YoutubeDL
Previously, these warnings did not obey `--no-warnings` and did not output colors
2021-05-03 22:49:04 +05:30
pukkandan
4cd0a709aa Fix preload_download_archive writing verbose message to stdout
* And move it after all deprecated warnings
2021-05-03 22:49:03 +05:30
pukkandan
1815d1028b [zee5] Fix py2 compatibility 2021-05-03 22:49:03 +05:30
The Hatsune Daishi
0fa9a1e236 [whowatch] Add extractor #292
closes #223

Authored by: nao20010128nao 
Modified from: 9e4a0e061a/youtube_dl/extractor/whowatch.py
2021-05-02 19:43:37 +05:30
pukkandan
eb55bad5a0 [aria2c] Fix whitespace being stripped off
Closes #276
2021-05-02 14:03:13 +05:30
pukkandan
cc0ec3e161 Do not strip out whitespaces in -o and -P
Related: https://github.com/yt-dlp/yt-dlp/issues/276#issuecomment-827361652
2021-05-02 14:03:12 +05:30
pukkandan
80185155a1 [ukcolumn] Add Extractor
Closes #287
2021-05-02 13:57:50 +05:30
pukkandan
c755f1901f [CBS] Improve _VALID_URL to support movies
Closes #290
Tested by: BeeMuffins
2021-05-01 21:32:14 +05:30
pukkandan
68b91dc905 [youtube] Add oembed to reserved names 2021-05-01 21:24:31 +05:30
pukkandan
88f06afc0c [rmcdecouverte] Improve _VALID_URL
Closes #291
2021-05-01 21:24:31 +05:30
CXwudi
40078a55e2 [niconico] Fix bug in thumbnail extraction #289
Bug from: 6b1d8c1e30
Authored by: CXwudi
2021-05-01 19:35:47 +05:30
pukkandan
d2558234cf [utils] Escape URL while sanitizing
Closes #263

While this fixes the issue in question, it does not try to address the root-cause of the problem
Refer: 915f911e36, f5fa042c82
2021-04-29 05:20:50 +05:30
pukkandan
f5fa042c82 Revert "[utils] Encode URLs in YoutubeDLCookieProcessor"
This reverts commit 915f911e36.

When the request is copied, `unredirected_hdrs` are not copied, which causes issues elsewhere
Reopens #263
2021-04-29 05:20:18 +05:30
pukkandan
07e4a40a9a [crackle] Improve extraction (See desc)
Closes #282

* Refactor authorization as an extension to `_download_json`
* Better error messages and warnings
* Respect `--ignore-no-formats-error`
* Extract subtitles from manifests
* Try with crackle's geo-location service if all hard-coded countries fail
2021-04-29 05:20:16 +05:30
pukkandan
e28f1c0ae8 [cleanup] Fix linter and some typos
* Also remove inconsistent use of `"` in setup.py
2021-04-28 19:59:40 +05:30
pukkandan
ef39f8600a [curiositystream] Fix collections
Closes #277

* A bug with authentication was reported in <https://github.com/yt-dlp/yt-dlp/issues/277#issuecomment-828254721> but cannot be tested without an account
2021-04-28 19:29:33 +05:30
pukkandan
2291dbce2a [niconico] Fix HLS formats
Closes #171

* The structure of the API JSON was changed
* Smile Video seems to be no longer available. So remove the warning
* Move ping to downloader
* Change heartbeat interval to 40sec
* Remove unnecessary API headers

Authored-by: CXwudi, tsukumijima, nao20010128nao, pukkandan
Tested by: tsukumijima
2021-04-28 19:18:29 +05:30
pukkandan
58f197b76c Revert "[core] be able to hand over id and title using url_result"
This reverts commit 0704d2224b.

This is a commit from `youtube-dlc`. It is not clear what the original purpose of this was. It seems to be a way for extractors to pass `title` and `id` through when the entry is processed by another extractor

* But `title` can already be passed through using `url_transparent`
* `id` is never supposed to be passed through since it could cause issues with archiving
2021-04-28 19:18:06 +05:30
pukkandan
895b0931e5 [youtube:tab] Detect playlists inside community posts 2021-04-28 19:18:06 +05:30
pukkandan
1ad047d0f7 [nebula] Move to nebula.app
Closes #272
Tested by: Lamieur
2021-04-28 19:18:06 +05:30
pukkandan
be6202f12b Subtitle extraction from streaming media manifests #247
Authored by fstirlitz
Modified from: https://github.com/ytdl-org/youtube-dl/pull/6144

Closes: #73
Fixes:
https://github.com/ytdl-org/youtube-dl/issues/6106
https://github.com/ytdl-org/youtube-dl/issues/14977
https://github.com/ytdl-org/youtube-dl/issues/21438
https://github.com/ytdl-org/youtube-dl/issues/23609
https://github.com/ytdl-org/youtube-dl/issues/28132

Might also fix (untested):
https://github.com/ytdl-org/youtube-dl/issues/15424
https://github.com/ytdl-org/youtube-dl/issues/18267
https://github.com/ytdl-org/youtube-dl/issues/23899
https://github.com/ytdl-org/youtube-dl/issues/24375
https://github.com/ytdl-org/youtube-dl/issues/24595
https://github.com/ytdl-org/youtube-dl/issues/27899

Related:
https://github.com/ytdl-org/youtube-dl/issues/22379
https://github.com/ytdl-org/youtube-dl/pull/24517
https://github.com/ytdl-org/youtube-dl/pull/24886
https://github.com/ytdl-org/youtube-dl/pull/27215

Notes:
* The functions `extractor.common._extract_..._formats` are still kept for compatibility
* Only some extractors have currently been moved to using `_extract_..._formats_and_subtitles`
* Direct subtitle manifests (without a master) are not supported and are wrongly identified as containing video formats
* AES support is untested
* The fragmented TTML subtitles extracted from DASH/ISM are valid, but are unsupported by `ffmpeg` and most video players
    * Their XML fragments can be dumped using `ffmpeg -i in.mp4 -f data -map 0 -c copy out.ttml`.
        Once the unnecessary headers are stripped out of this, it becomes a valid self-contained ttml file
    * The ttml subs downloaded from DASH manifests can also be directly opened with <https://github.com/SubtitleEdit>
* Fragmented WebVTT files extracted from DASH/ISM are also unsupported by most tools
    * Unlike the ttml files, the XML fragments of these cannot be dumped using `ffmpeg`
    * The webtt subs extracted from DASH can be parsed by <https://github.com/gpac/gpac>
    * But validity of the those extracted from ISM are untested
2021-04-28 19:02:43 +05:30
Felix S
e8f834cd8d [threeqsdn] Extract subtitles from streaming manifests 2021-04-28 17:24:50 +05:30
Felix S
e0e624ca7f [canvas] Extract subtitles from streaming manifests 2021-04-28 17:24:19 +05:30
Felix S
ec4f374c05 [wat] Extract subtitles from streaming manifests 2021-04-28 17:24:08 +05:30
Felix S
c811e8d8bd [atresplayer] Extract subtitles from streaming manifests 2021-04-28 17:23:56 +05:30
Felix S
b2cd5da460 [francetv] Extract subtitles from the HLS manifest 2021-04-28 17:23:47 +05:30
Felix S
2de3b21e05 [uplynk] Extract subtitles from HLS manifests 2021-04-28 17:23:37 +05:30
Felix S
4bed436371 [twitter] Extract subtitles from HLS manifests 2021-04-28 17:23:27 +05:30
Felix S
efe9dba595 [srgssr] Extract subtitles from HLS manifests 2021-04-28 17:23:16 +05:30
Felix S
47f4203dd3 [nytimes] Extract subtitles from HLS manifests 2021-04-28 17:23:05 +05:30
Felix S
015c10aeec [roosterteeth] Use common code for subtitle extraction 2021-04-28 17:22:56 +05:30
Felix S
a00d781b73 [elonet] Use common code for subtitle extraction 2021-04-28 17:22:45 +05:30
Felix S
0c541b563f [tv4] Extract subtitles from streaming manifests 2021-04-28 17:22:36 +05:30
Felix S
64a5cf7929 [byutv] Extract subtitles from streaming manifests 2021-04-28 17:22:27 +05:30
Felix S
7a450a3b1c [generic] Extract subtitles from direct SSTR manifest links 2021-04-28 17:22:18 +05:30
Felix S
7de27caf16 [generic] Extract subtitles from direct DASH manifest links 2021-04-28 17:22:07 +05:30
Felix S
c26326c1be [generic] Extract subtitles from direct HLS manifest links 2021-04-28 17:21:55 +05:30
Felix S
66a1b8643a [downloader/ism] Support muxing TTML subtitles 2021-04-28 17:21:45 +05:30
Felix S
15828bcf25 [downloader/hls] Handle MPEG-2 PES timestamp overflow 2021-04-28 17:21:35 +05:30
Felix S
333217f43e [downloader/hls] Remove duplicate cues using a sliding window of candidates 2021-04-28 17:21:26 +05:30
Felix S
4a2f19abbd [downloader/hls] Assemble single-file WebVTT subtitles from HLS segments 2021-04-28 17:21:14 +05:30
Felix S
5fbcebed8c [test] Test SSTR manifest parsing 2021-04-28 17:21:01 +05:30
Felix S
becdc7f82c [test] Test subtitle extraction from DASH manifests 2021-04-28 17:20:49 +05:30
Felix S
73b9088a1c [test] Test subtitle extraction from HLS manifests 2021-04-28 17:20:39 +05:30
Felix S
f6a1d69a87 [extractor/common] Extend _extract_akamai_formats to also extract subtitle tracks 2021-04-28 17:20:29 +05:30
Felix S
fd76a14259 [extractor/common, downloader/ism] Extract SSTR subtitle tracks
_parse_ism_formats was extended into _parse_ism_formats_and_subtitles;
all direct users were updated, though _extract_ism_formats was left
as a compatibility wrapper.

The SSTR downloader was also modified in order to prepare for muxing
subtitle streams, although no support for any subtitle codecs was
added in this commit.
2021-04-28 17:20:20 +05:30
Felix S
171e59edd4 [extractor/common] Extract DASH subtitle tracks
_extract_mpd_formats and _parse_mpd_formats were extended into
_…_formats_and_subtitles; wrappers with old names are provided
for compatibility.
2021-04-28 17:20:11 +05:30
Felix S
a0c3b2d5cf [extractor/common] Extract HLS subtitle tracks
_extract_m3u8_formats is renamed to _extract_m3u8_formats_and_subtitles
and extended to handle subtitle tracks instead of skipping them;
a wrapper with the old name is provided for compatibility.

_parse_m3u8_formats is likewise renamed and extended, but without adding
the compatibility wrapper; the test suite is adjusted to test the enhanced
method instead.
2021-04-28 17:19:57 +05:30
Felix S
19bb39202d [extractor/common] Generalise _merge_subtitles
This allows modifying a subtitles dictionary in-place.
2021-04-28 17:19:46 +05:30
Felix S
d4553567d2 [downloader/ism] Prevent writing the header again when resuming an interrupted download 2021-04-28 17:19:37 +05:30
Felix S
4d49884c58 [downloader/fragment] Allow persisting extra state when a download is interrupted 2021-04-28 17:19:31 +05:30
Felix S
5873d4ccdd [utils] Improve bug_report_message
Add an optional argument specifying the text that should go before
the message.
2021-04-28 17:19:23 +05:30
Hadi0609
db9a564b6a [zee5] Fix extraction for some URLs (#279)
Closes: #278
2021-04-28 14:51:54 +05:30
Felix S
c72967d5de [mediasite] Generalize URL pattern (#275)
Authored by: fstirlitz
2021-04-26 17:23:20 +05:30
pukkandan
598d185db1 Fix case sensitivity of format selector
Bug introduced in f8d4ad9ab0
2021-04-26 10:56:56 +05:30
pukkandan
b982cbdd0e [limelight] Obey allow_unplayable_formats 2021-04-26 10:56:55 +05:30
pukkandan
6a04a74e8b [FormatSort] Fix for when some formats have quality and others don't 2021-04-26 10:56:54 +05:30
pukkandan
88728713c8 Py2 compatibility for FileNotFoundError 2021-04-26 10:56:53 +05:30
CXwudi
6b1d8c1e30 [niconico] Fix title and thumbnail extraction (#273)
Authored by: CXwudi
2021-04-26 08:23:57 +05:30
Ashish
87c3d06271 [Mxplayer] Add MxplayerShowIE (#270)
Authored by: Ashish0804
2021-04-26 08:12:51 +05:30
pukkandan
915f911e36 [utils] Encode URLs in YoutubeDLCookieProcessor
Closes #263
2021-04-24 19:20:07 +05:30
pukkandan
cf9d6cfb0c [tubi] Raise "no video formats" error when video url is empty
Related: #266
2021-04-24 17:52:33 +05:30
pukkandan
bbed5763f1 [francetvinfo] Improve video id extraction
Closes #261
2021-04-23 00:01:09 +05:30
pukkandan
ca0b91b39e [version] update :ci skip all 2021-04-22 17:30:36 +05:30
80 changed files with 4158 additions and 885 deletions

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.11. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.22. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running yt-dlp version **2021.04.11**
- [ ] I've verified that I'm running yt-dlp version **2021.04.22**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
@@ -44,7 +44,7 @@ Add the `-v` flag to your command line you run yt-dlp with (`yt-dlp -v <your com
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.04.11
[debug] yt-dlp version 2021.04.22
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.11. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.22. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/yt-dlp/yt-dlp. yt-dlp does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running yt-dlp version **2021.04.11**
- [ ] I've verified that I'm running yt-dlp version **2021.04.22**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones

View File

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.11. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.22. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running yt-dlp version **2021.04.11**
- [ ] I've verified that I'm running yt-dlp version **2021.04.22**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones

View File

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.11. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.22. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -30,7 +30,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running yt-dlp version **2021.04.11**
- [ ] I've verified that I'm running yt-dlp version **2021.04.22**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -46,7 +46,7 @@ Add the `-v` flag to your command line you run yt-dlp with (`yt-dlp -v <your com
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.04.11
[debug] yt-dlp version 2021.04.22
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.11. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.04.22. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running yt-dlp version **2021.04.11**
- [ ] I've verified that I'm running yt-dlp version **2021.04.22**
- [ ] I've searched the bugtracker for similar feature requests including closed ones

View File

@@ -41,11 +41,18 @@ jobs:
- name: Install Jython
if: ${{ matrix.python-impl == 'jython' }}
run: |
wget http://search.maven.org/remotecontent?filepath=org/python/jython-installer/2.7.1/jython-installer-2.7.1.jar -O jython-installer.jar
wget https://repo1.maven.org/maven2/org/python/jython-installer/2.7.1/jython-installer-2.7.1.jar -O jython-installer.jar
java -jar jython-installer.jar -s -d "$HOME/jython"
echo "$HOME/jython/bin" >> $GITHUB_PATH
- name: Install nose
if: ${{ matrix.python-impl != 'jython' }}
run: pip install nose
- name: Install nose (Jython)
if: ${{ matrix.python-impl == 'jython' }}
# Working around deprecation of support for non-SNI clients at PyPI CDN (see https://status.python.org/incidents/hzmjhqsdjqgb)
run: |
wget https://files.pythonhosted.org/packages/99/4f/13fb671119e65c4dce97c60e67d3fd9e6f7f809f2b307e2611f4701205cb/nose-1.3.7-py2-none-any.whl
pip install nose-1.3.7-py2-none-any.whl
- name: Run tests
continue-on-error: ${{ matrix.ytdl-test-set == 'download' || matrix.python-impl == 'jython' }}
env:

View File

@@ -41,11 +41,18 @@ jobs:
- name: Install Jython
if: ${{ matrix.python-impl == 'jython' }}
run: |
wget http://search.maven.org/remotecontent?filepath=org/python/jython-installer/2.7.1/jython-installer-2.7.1.jar -O jython-installer.jar
wget https://repo1.maven.org/maven2/org/python/jython-installer/2.7.1/jython-installer-2.7.1.jar -O jython-installer.jar
java -jar jython-installer.jar -s -d "$HOME/jython"
echo "$HOME/jython/bin" >> $GITHUB_PATH
- name: Install nose
if: ${{ matrix.python-impl != 'jython' }}
run: pip install nose
- name: Install nose (Jython)
if: ${{ matrix.python-impl == 'jython' }}
# Working around deprecation of support for non-SNI clients at PyPI CDN (see https://status.python.org/incidents/hzmjhqsdjqgb)
run: |
wget https://files.pythonhosted.org/packages/99/4f/13fb671119e65c4dce97c60e67d3fd9e6f7f809f2b307e2611f4701205cb/nose-1.3.7-py2-none-any.whl
pip install nose-1.3.7-py2-none-any.whl
- name: Run tests
continue-on-error: ${{ matrix.ytdl-test-set == 'download' || matrix.python-impl == 'jython' }}
env:

View File

@@ -40,3 +40,7 @@ hheimbuerger
B0pol
lkho
fstirlitz
Lamieur
tsukumijima
Hadi0609
b5eff52

View File

@@ -19,6 +19,61 @@
-->
### 2021.05.11
* **Deprecate support for python versions < 3.6**
* **Subtitle extraction from manifests** by [fstirlitz](https://github.com/fstirlitz). See [be6202f12b97858b9d716e608394b51065d0419f](https://github.com/yt-dlp/yt-dlp/commit/be6202f12b97858b9d716e608394b51065d0419f) for details
* **Improve output template:**
* Allow slicing lists/strings using `field.start:end:step`
* A field can also be used as offset like `field1+num+field2`
* A default value can be given using `field|default`
* Prevent invalid fields from causing errors
* Merge youtube-dl: Upto [commit/a726009](https://github.com/ytdl-org/youtube-dl/commit/a7260099873acc6dc7d76cafad2f6b139087afd0)
* **Remove options** `-l`, `-t`, `-A` completely and disable `--auto-number`, `--title`, `--literal`, `--id`
* [Plugins] Prioritize plugins over standard extractors and prevent plugins from overwriting the standard extractor classes
* [downloader] Fix `quiet` and `to_stderr`
* [fragment] Ensure the file is closed on error
* [fragment] Make sure first segment is not skipped
* [aria2c] Fix whitespace being stripped off
* [embedthumbnail] Fix bug where jpeg thumbnails were converted again
* [FormatSort] Fix for when some formats have quality and others don't
* [utils] Add `network_exceptions`
* [utils] Escape URL while sanitizing
* [ukcolumn] Add Extractor
* [whowatch] Add extractor by [nao20010128nao](https://github.com/nao20010128nao)
* [CBS] Improve `_VALID_URL` to support movies
* [crackle] Improve extraction
* [curiositystream] Fix collections
* [francetvinfo] Improve video id extraction
* [generic] Respect the encoding in manifest
* [limelight] Obey `allow_unplayable_formats`
* [mediasite] Generalize URL pattern by [fstirlitz](https://github.com/fstirlitz)
* [mxplayer] Add MxplayerShowIE by [Ashish0804](https://github.com/Ashish0804)
* [nebula] Move to nebula.app by [Lamieur](https://github.com/Lamieur)
* [niconico] Fix HLS formats by [CXwudi](https://github.com/CXwudi), [tsukumijima](https://github.com/tsukumijima), [nao20010128nao](https://github.com/nao20010128nao) and [pukkandan](https://github.com/pukkandan)
* [niconico] Fix title and thumbnail extraction by [CXwudi](https://github.com/CXwudi)
* [plutotv] Extract subtitles from manifests
* [plutotv] Fix format extraction for some urls
* [rmcdecouverte] Improve `_VALID_URL`
* [sonyliv] Fix `title` and `series` extraction by [Ashish0804](https://github.com/Ashish0804)
* [tubi] Raise "no video formats" error when video url is empty
* [youtube:tab] Detect playlists inside community posts
* [youtube] Add `oembed` to reserved names
* [zee5] Fix extraction for some URLs by [Hadi0609](https://github.com/Hadi0609)
* [zee5] Fix py2 compatibility
* Fix `playlist_index` and add `playlist_autonumber`. See [#302](https://github.com/yt-dlp/yt-dlp/issues/302) for details
* Add experimental option `--check-formats` to test the URLs before format selection
* Option `--compat-options` to revert some of yt-dlp's changes
* Deprecates `--list-formats-as-table`, `--list-formats-old`
* Fix number of digits in `%(playlist_index)s`
* Fix case sensitivity of format selector
* Revert "[core] be able to hand over id and title using url_result"
* Do not strip out whitespaces in `-o` and `-P`
* Fix `preload_download_archive` writing verbose message to `stdout`
* Move option warnings to `YoutubeDL`so that they obey `--no-warnings` and can output colors
* Py2 compatibility for `FileNotFoundError`
### 2021.04.22
* **Improve output template:**
* Objects can be traversed like `%(field.key1.key2)s`
@@ -33,11 +88,11 @@
* Add option `--skip-playlist-after-errors` to skip the rest of a playlist after a given number of errors are encountered
* Merge youtube-dl: Upto [commit/7e8b3f9](https://github.com/ytdl-org/youtube-dl/commit/7e8b3f9439ebefb3a3a4e5da9c0bd2b595976438)
* [downloader] Fix bug in downloader selection
* [BilibiliChannel] Fix pagination by [nao20010128nao](https://github.com/nao20010128nao) and[pukkandan](https://github.com/pukkandan)
* [BilibiliChannel] Fix pagination by [nao20010128nao](https://github.com/nao20010128nao) and [pukkandan](https://github.com/pukkandan)
* [rai] Add support for http formats by [nixxo](https://github.com/nixxo)
* [TubiTv] Add TubiTvShowIE by [Ashish0804](https://github.com/Ashish0804)
* [twitcasting] Fix extractor
* [viu:ott] Fix extractor and support series by [lkho](https://github.com/lkho) and[pukkandan](https://github.com/pukkandan)
* [viu:ott] Fix extractor and support series by [lkho](https://github.com/lkho) and [pukkandan](https://github.com/pukkandan)
* [youtube:tab] Show unavailable videos in playlists by [colethedj](https://github.com/colethedj)
* [youtube:tab] Reload with unavailable videos for all playlists
* [youtube] Ignore invalid stretch ratio

View File

@@ -20,6 +20,7 @@ A command-line program to download videos from YouTube and many other [video pla
yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on the now inactive [youtube-dlc](https://github.com/blackjack4494/yt-dlc). The main focus of this project is adding new features and patches while also keeping up to date with the original project
* [NEW FEATURES](#new-features)
* [Differences in default behavior](#differences-in-default-behavior)
* [INSTALLATION](#installation)
* [Dependencies](#dependencies)
* [Update](#update)
@@ -65,9 +66,9 @@ The major new features from the latest release of [blackjack4494/yt-dlc](https:/
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection that what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with youtube-dl v2021.04.17**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with youtube-dl [commit/a726009](https://github.com/ytdl-org/youtube-dl/commit/a7260099873acc6dc7d76cafad2f6b139087afd0)**: (v2021.04.26) You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, Playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **Youtube improvements**:
* All Youtube Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`) works and supports downloading multiple pages of content
@@ -81,17 +82,21 @@ The major new features from the latest release of [blackjack4494/yt-dlc](https:/
* **Aria2c with HLS/DASH**: You can use `aria2c` as the external downloader for DASH(mpd) and HLS(m3u8) formats
* **New extractors**: AnimeLab, Philo MSO, Rcs, Gedi, bitwave.tv, mildom, audius, zee5, mtv.it, wimtv, pluto.tv, niconico users, discoveryplus.in, mediathek, NFHSNetwork, nebula
* **New extractors**: AnimeLab, Philo MSO, Rcs, Gedi, bitwave.tv, mildom, audius, zee5, mtv.it, wimtv, pluto.tv, niconico users, discoveryplus.in, mediathek, NFHSNetwork, nebula, ukcolumn, whowatch, MxplayerShow
* **Fixed extractors**: archive.org, roosterteeth.com, skyit, instagram, itv, SouthparkDe, spreaker, Vlive, akamai, ina, rumble, tennistv, amcnetworks, la7 podcasts, linuxacadamy, nitter, twitcasting, viu
* **Fixed extractors**: archive.org, roosterteeth.com, skyit, instagram, itv, SouthparkDe, spreaker, Vlive, akamai, ina, rumble, tennistv, amcnetworks, la7 podcasts, linuxacadamy, nitter, twitcasting, viu, crackle, curiositystream, mediasite, rmcdecouverte, sonyliv, tubi
* **Subtitle extraction from manifests**: Subtitles can be extracted from streaming media manifests. See [be6202f12b97858b9d716e608394b51065d0419f](https://github.com/yt-dlp/yt-dlp/commit/be6202f12b97858b9d716e608394b51065d0419f) for details
* **Multiple paths and output templates**: You can give different [output templates](#output-template) and download paths for different types of files. You can also set a temporary path where intermediary files are downloaded to using `--paths` (`-P`)
* **Portable Configuration**: Configuration files are automatically loaded from the home and root directories. See [configuration](#configuration) for details
* **Other new options**: `--parse-metadata`, `--list-formats-as-table`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Output template improvements**: Output templates can now have date-time formatting, numeric offsets, object traversal etc. See [output template](#output-template) for details. Even more advanced operations can also be done with the help of `--parse-metadata`
* **Improvements**: Multiple `--postprocessor-args` and `--external-downloader-args`, Date/time formatting in `-o`, faster archive checking, more [format selection options](#format-selection) etc
* **Other new options**: `--sleep-requests`, `--convert-thumbnails`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Improvements**: Multiple `--postprocessor-args` and `--downloader-args`, faster archive checking, more [format selection options](#format-selection) etc
* **Plugin extractors**: Extractors can be loaded from an external file. See [plugins](#plugins) for details
@@ -105,15 +110,40 @@ See [changelog](Changelog.md) or [commits](https://github.com/yt-dlp/yt-dlp/comm
If you are coming from [youtube-dl](https://github.com/ytdl-org/youtube-dl), the amount of changes are very large. Compare [options](#options) and [supported sites](supportedsites.md) with youtube-dl's to get an idea of the massive number of features/patches [youtube-dlc](https://github.com/blackjack4494/yt-dlc) has accumulated.
### Differences in default behavior
Some of yt-dlp's default options are different from that of youtube-dl and youtube-dlc.
1. The options `--id`, `--auto-number` (`-A`), `--title` (`-t`) and `--literal` (`-l`), no longer work. See [removed options](#Removed) for details
1. `avconv` is not supported as as an alternative to `ffmpeg`
1. The default [output template](#output-template) is `%(title)s [%(id)s].%(ext)s`. There is no real reason for this change. This was changed before yt-dlp was ever made public and now there are no plans to change it back to `%(title)s.%(id)s.%(ext)s`. Instead, you may use `--compat-options filename`
1. The default [format sorting](sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order
1. The default format selector is `bv*+ba/b`. This means that if a combined video + audio format that is better than the best video-only format is found, the former will be prefered. Use `-f bv+ba/b` or `--compat-options format-spec` to revert this
1. Unlike youtube-dlc, yt-dlp does not allow merging multiple audio/video streams into one file by default (since this conflicts with the use of `-f bv*+ba`). If needed, this feature must be enabled using `--audio-multistreams` and `--video-multistreams`. You can also use `--compat-options multistreams` to enable both
1. `--ignore-errors` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
1. When writing metadata files such as thumbnails, description or infojson, the same information (if available) is also written for playlists. Use `--no-write-playlist-metafiles` or `--compat-options no-playlist-metafiles` to not write these files
1. `playlist_index` behaves differently when used with options like `--playlist-reverse` and `--playlist-items`. See [#302](https://github.com/yt-dlp/yt-dlp/issues/302) for details. You can use `--compat-options playlist-index` if you want to keep the earlier behavior
1. The output of `-F` is listed in a new format. Use `--compat-options list-formats` to revert this
1. Youtube live chat (if available) is considered as a subtitle. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent live chat from downloading
1. Youtube channel URLs are automatically redirected to `/video`. Either append a `/featured` to the URL or use `--compat-options no-youtube-channel-redirect` to download only the videos in the home page
1. Unavailable videos are also listed for youtube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
For ease of use, a few more compat options are available:
1. `--compat-options all` = Use all compat options
1. `--compat-options youtube-dl` = `--compat-options all,-multistreams`
1. `--compat-options youtube-dlc` = `--compat-options all,-no-live-chat,-no-youtube-channel-redirect`
# INSTALLATION
yt-dlp is not platform specific. So it should work on your Unix box, on Windows or on macOS
You can install yt-dlp using one of the following methods:
* Download the binary from the [latest release](https://github.com/yt-dlp/yt-dlp/releases/latest) (recommended method)
* Use [PyPI package](https://pypi.org/project/yt-dlp): `python -m pip install --upgrade yt-dlp`
* Use pip+git: `python -m pip install --upgrade git+https://github.com/yt-dlp/yt-dlp.git@release`
* Install master branch: `python -m pip install --upgrade git+https://github.com/yt-dlp/yt-dlp`
* Use [PyPI package](https://pypi.org/project/yt-dlp): `python3 -m pip install --upgrade yt-dlp`
* Use pip+git: `python3 -m pip install --upgrade git+https://github.com/yt-dlp/yt-dlp.git@release`
* Install master branch: `python3 -m pip install --upgrade git+https://github.com/yt-dlp/yt-dlp`
Note that on some systems, you may need to use `py` or `python` instead of `python3`
UNIX users (Linux, macOS, BSD) can also install the [latest release](https://github.com/yt-dlp/yt-dlp/releases/latest) one of the following ways:
@@ -133,9 +163,11 @@ sudo chmod a+rx /usr/local/bin/yt-dlp
```
### DEPENDENCIES
Python versions 2.6, 2.7, or 3.2+ are currently supported. However, 3.2+ is strongly recommended and python2 support will be deprecated in the future.
Python versions 3.6+ (CPython and PyPy) are officially supported. Other versions and implementations may or maynot work correctly.
Although there are no required dependencies, `ffmpeg` and `ffprobe` are highly recommended. Other optional dependencies are `sponskrub`, `AtomicParsley`, `mutagen`, `pycryptodome` and any of the supported external downloaders. Note that the windows releases are already built with the python interpreter, mutagen and pycryptodome included.
On windows, [Microsoft Visual C++ 2010 Redistributable Package (x86)](https://www.microsoft.com/en-us/download/details.aspx?id=26999) is also necessary to run yt-dlp. You probably already have this, but if the executable throws an error due to missing `MSVCR100.dll` you need to install it.
Although there are no other required dependencies, `ffmpeg` and `ffprobe` are highly recommended. Other optional dependencies are `sponskrub`, `AtomicParsley`, `mutagen`, `pycryptodome` and any of the supported external downloaders. Note that the windows releases are already built with the python interpreter, mutagen and pycryptodome included.
### UPDATE
You can use `yt-dlp -U` to update if you are using the provided release.
@@ -146,13 +178,15 @@ If you are using `pip`, simply re-run the same command that was used to install
**For Windows**:
To build the Windows executable, you must have pyinstaller (and optionally mutagen and pycryptodome)
python -m pip install --upgrade pyinstaller mutagen pycryptodome
python3 -m pip install --upgrade pyinstaller mutagen pycryptodome
Once you have all the necessary dependencies installed, just run `py pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it. It is strongly recommended to use python3 although python2.6+ is supported.
Once you have all the necessary dependencies installed, just run `py pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it.
You can also build the executable without any version info or metadata by using:
pyinstaller.exe yt_dlp\__main__.py --onefile --name yt-dlp
Note that pyinstaller [does not support](https://github.com/pyinstaller/pyinstaller#requirements-and-tested-platforms) Python installed from the Windows store without using a virtual environment
**For Unix**:
You will need the required build tools: `python`, `make` (GNU), `pandoc`, `zip`, `nosetests`
@@ -210,6 +244,11 @@ Then simply run `make`. You can also run `make yt-dlp` instead to compile only t
--mark-watched Mark videos watched (YouTube only)
--no-mark-watched Do not mark videos watched (default)
--no-colors Do not emit color codes in output
--compat-options OPTS Options that can help keep compatibility
with youtube-dl and youtube-dlc
configurations by reverting some of the
changes made in yt-dlp. See "Differences in
default behavior" for details
## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
@@ -577,12 +616,10 @@ Then simply run `make`. You can also run `make yt-dlp` instead to compile only t
containers irrespective of quality
--no-prefer-free-formats Don't give any special preference to free
containers (default)
--check-formats Check that the formats selected are
actually downloadable (Experimental)
-F, --list-formats List all available formats of requested
videos
--list-formats-as-table Present the output of -F in tabular form
(default)
--list-formats-old Present the output of -F in the old form
(Alias: --no-list-formats-as-table)
--merge-output-format FORMAT If a merge is required (e.g.
bestvideo+bestaudio), output to given
container format. One of mkv, mp4, ogg,
@@ -842,13 +879,14 @@ The simplest usage of `-o` is not to set any template arguments when downloading
It may however also contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations.
The field names themselves (the part inside the parenthesis) can also have some special formatting:
1. **Date/time Formatting**: Date/time fields can be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it separated from the field name using a `>`. Eg: `%(duration>%H-%M-%S)s` or `%(upload_date>%Y-%m-%d)s`
2. **Offset numbers**: Numeric fields can have an initial offset specified by using a `+` separator. Eg: `%(playlist_index+10)03d`. This can also be used in conjunction with the date-time formatting. Eg: `%(epoch+-3600>%H-%M-%S)s`
3. **Object traversal**: The dictionaries and lists available in metadata can be traversed by using a `.` (dot) separator. Eg: `%(tags.0)s` or `%(subtitles.en.-1.ext)`. Note that the fields that become available using this method are not listed below. Use `-j` to see such fields
1. **Object traversal**: The dictionaries and lists available in metadata can be traversed by using a `.` (dot) separator. You can also do python slicing using `:`. Eg: `%(tags.0)s`, `%(subtitles.en.-1.ext)`, `%(id.3:7:-1)s`. Note that the fields that become available using this method are not listed below. Use `-j` to see such fields
1. **Addition**: Addition and subtraction of numeric fields can be done using `+` and `-` respectively. Eg: `%(playlist_index+10)03d`, `%(n_entries+1-playlist_index)d`
1. **Date/time Formatting**: Date/time fields can be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it separated from the field name using a `>`. Eg: `%(duration>%H-%M-%S)s`, `%(upload_date>%Y-%m-%d)s`, `%(epoch-3600>%H-%M-%S)s`
1. **Default**: A default value can be specified for when the field is empty using a `|` seperator. This overrides `--output-na-template`. Eg: `%(uploader|Unknown)s`
To summarize, the general syntax for a field is:
```
%(name[.keys][+offset][>strf])[flags][width][.precision][length]type
%(name[.keys][addition][>strf][|default])[flags][width][.precision][length]type
```
Additionally, you can set different output templates for the various metadata files separately from the general output template by specifying the type of file followed by the template separated by a colon `:`. The different file types supported are `subtitle`, `thumbnail`, `description`, `annotation`, `infojson`, `pl_description`, `pl_infojson`, `chapter`. For example, `-o '%(title)s.%(ext)s' -o 'thumbnail:%(title)s\%(title)s.%(ext)s'` will put the thumbnails in a folder with the same name as the video.
@@ -1281,6 +1319,8 @@ While these options still work, their use is not recommended since there are oth
--metadata-from-title FORMAT --parse-metadata "%(title)s:FORMAT"
--hls-prefer-native --downloader "m3u8:native"
--hls-prefer-ffmpeg --downloader "m3u8:ffmpeg"
--list-formats-old --compat-options list-formats (Alias: --no-list-formats-as-table)
--list-formats-as-table --compat-options -list-formats [Default] (Alias: --no-list-formats-old)
--sponskrub-args ARGS --ppa "sponskrub:ARGS"
--test Used by developers for testing extractors. Not intended for the end user
@@ -1314,15 +1354,15 @@ These options may no longer work as intended
--include-ads No longer supported
--no-include-ads Default
--youtube-print-sig-code No longer supported
#### Removed
These options were deprecated since 2014 and have now been entirely removed
--id -o "%(id)s.%(ext)s"
-A, --auto-number -o "%(autonumber)s-%(id)s.%(ext)s"
-t, --title -o "%(title)s-%(id)s.%(ext)s"
-l, --literal -o accepts literal names
#### Removed
Currently, there are no options that have been completely removed. But there are plans to remove the old output options `-A`,`-t`, `-l`, `--id` (which have been deprecated since 2014) in the near future. If you are still using these, please move to using `--output` instead
# MORE
For FAQ, Developer Instructions etc., see the [original README](https://github.com/ytdl-org/youtube-dl#faq)

View File

@@ -18,13 +18,13 @@ DESCRIPTION = 'Command-line program to download videos from YouTube.com and many
LONG_DESCRIPTION = '\n\n'.join((
'Official repository: <https://github.com/yt-dlp/yt-dlp>',
'**PS**: Many links in this document will not work since this is a copy of the README.md from Github',
open("README.md", "r", encoding="utf-8").read()))
open('README.md', 'r', encoding='utf-8').read()))
REQUIREMENTS = ['mutagen', 'pycryptodome']
if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
print("inv")
print('inv')
else:
files_spec = [
('share/bash-completion/completions', ['completions/bash/yt-dlp']),
@@ -67,17 +67,17 @@ class build_lazy_extractors(Command):
)
packages = find_packages(exclude=("youtube_dl", "test", "ytdlp_plugins"))
packages = find_packages(exclude=('youtube_dl', 'test', 'ytdlp_plugins'))
setup(
name="yt-dlp",
name='yt-dlp',
version=__version__,
maintainer="pukkandan",
maintainer_email="pukkandan.ytdlp@gmail.com",
maintainer='pukkandan',
maintainer_email='pukkandan.ytdlp@gmail.com',
description=DESCRIPTION,
long_description=LONG_DESCRIPTION,
long_description_content_type="text/markdown",
url="https://github.com/yt-dlp/yt-dlp",
long_description_content_type='text/markdown',
url='https://github.com/yt-dlp/yt-dlp',
packages=packages,
install_requires=REQUIREMENTS,
project_urls={
@@ -87,28 +87,28 @@ setup(
#'Funding': 'https://donate.pypi.org',
},
classifiers=[
"Topic :: Multimedia :: Video",
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"Programming Language :: Python",
"Programming Language :: Python :: 2",
"Programming Language :: Python :: 2.6",
"Programming Language :: Python :: 2.7",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.2",
"Programming Language :: Python :: 3.3",
"Programming Language :: Python :: 3.4",
"Programming Language :: Python :: 3.5",
"Programming Language :: Python :: 3.6",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: Implementation",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: IronPython",
"Programming Language :: Python :: Implementation :: Jython",
"Programming Language :: Python :: Implementation :: PyPy",
"License :: Public Domain",
"Operating System :: OS Independent",
'Topic :: Multimedia :: Video',
'Development Status :: 5 - Production/Stable',
'Environment :: Console',
'Programming Language :: Python',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.6',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: Implementation',
'Programming Language :: Python :: Implementation :: CPython',
'Programming Language :: Python :: Implementation :: IronPython',
'Programming Language :: Python :: Implementation :: Jython',
'Programming Language :: Python :: Implementation :: PyPy',
'License :: Public Domain',
'Operating System :: OS Independent',
],
python_requires='>=2.6',

View File

@@ -130,7 +130,6 @@
- **bitwave:stream**
- **BleacherReport**
- **BleacherReportCMS**
- **blinkx**
- **Bloomberg**
- **BokeCC**
- **BongaCams**
@@ -225,7 +224,8 @@
- **Culturebox**
- **CultureUnplugged**
- **curiositystream**
- **curiositystream:collection**
- **curiositystream:collections**
- **curiositystream:series**
- **CWTV**
- **DagelijkseKost**: dagelijksekost.een.be
- **DailyMail**
@@ -584,6 +584,7 @@
- **Mwave**
- **MwaveMeetGreet**
- **Mxplayer**
- **MxplayerShow**
- **MyChannels**
- **MySpace**
- **MySpace:album**
@@ -1076,6 +1077,7 @@
- **UDNEmbed**: 聯合影音
- **UFCArabia**
- **UFCTV**
- **ukcolumn**
- **UKTVPlay**
- **umg:de**: Universal Music Deutschland
- **Unistra**
@@ -1194,6 +1196,7 @@
- **Weibo**
- **WeiboMobile**
- **WeiqiTV**: WQTV
- **whowatch**
- **WimTV**
- **Wistia**
- **WistiaPlaylist**
@@ -1204,7 +1207,7 @@
- **WWE**
- **XBef**
- **XboxClips**
- **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing
- **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, WolfStream, XVideoSharing
- **XHamster**
- **XHamsterEmbed**
- **XHamsterUser**

View File

@@ -684,17 +684,186 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'width': 1920,
'height': 1080,
'vcodec': 'avc1.64002a',
}]
}],
{}
),
(
'bipbop_16x9',
'https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8',
[{
"format_id": "bipbop_audio-BipBop Audio 2",
"format_index": None,
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/alternate_audio_aac/prog_index.m3u8",
"manifest_url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8",
"language": "eng",
"ext": "mp4",
"protocol": "m3u8",
"preference": None,
"quality": None,
"vcodec": "none",
"audio_ext": "mp4",
"video_ext": "none",
}, {
"format_id": "41",
"format_index": None,
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/gear0/prog_index.m3u8",
"manifest_url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8",
"tbr": 41.457,
"ext": "mp4",
"fps": None,
"protocol": "m3u8",
"preference": None,
"quality": None,
"vcodec": "none",
"acodec": "mp4a.40.2",
"audio_ext": "mp4",
"video_ext": "none",
"abr": 41.457,
}, {
"format_id": "263",
"format_index": None,
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/gear1/prog_index.m3u8",
"manifest_url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8",
"tbr": 263.851,
"ext": "mp4",
"fps": None,
"protocol": "m3u8",
"preference": None,
"quality": None,
"width": 416,
"height": 234,
"vcodec": "avc1.4d400d",
"acodec": "mp4a.40.2",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 263.851,
"abr": 0,
}, {
"format_id": "577",
"format_index": None,
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/gear2/prog_index.m3u8",
"manifest_url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8",
"tbr": 577.61,
"ext": "mp4",
"fps": None,
"protocol": "m3u8",
"preference": None,
"quality": None,
"width": 640,
"height": 360,
"vcodec": "avc1.4d401e",
"acodec": "mp4a.40.2",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 577.61,
"abr": 0,
}, {
"format_id": "915",
"format_index": None,
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/gear3/prog_index.m3u8",
"manifest_url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8",
"tbr": 915.905,
"ext": "mp4",
"fps": None,
"protocol": "m3u8",
"preference": None,
"quality": None,
"width": 960,
"height": 540,
"vcodec": "avc1.4d401f",
"acodec": "mp4a.40.2",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 915.905,
"abr": 0,
}, {
"format_id": "1030",
"format_index": None,
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/gear4/prog_index.m3u8",
"manifest_url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8",
"tbr": 1030.138,
"ext": "mp4",
"fps": None,
"protocol": "m3u8",
"preference": None,
"quality": None,
"width": 1280,
"height": 720,
"vcodec": "avc1.4d401f",
"acodec": "mp4a.40.2",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 1030.138,
"abr": 0,
}, {
"format_id": "1924",
"format_index": None,
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/gear5/prog_index.m3u8",
"manifest_url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/bipbop_16x9_variant.m3u8",
"tbr": 1924.009,
"ext": "mp4",
"fps": None,
"protocol": "m3u8",
"preference": None,
"quality": None,
"width": 1920,
"height": 1080,
"vcodec": "avc1.4d401f",
"acodec": "mp4a.40.2",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 1924.009,
"abr": 0,
}],
{
"en": [{
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/eng/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}, {
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/eng_forced/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}],
"fr": [{
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/fra/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}, {
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/fra_forced/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}],
"es": [{
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/spa/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}, {
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/spa_forced/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}],
"ja": [{
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/jpn/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}, {
"url": "https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_16x9/subtitles/jpn_forced/prog_index.m3u8",
"ext": "vtt",
"protocol": "m3u8_native"
}],
}
),
]
for m3u8_file, m3u8_url, expected_formats in _TEST_CASES:
for m3u8_file, m3u8_url, expected_formats, expected_subs in _TEST_CASES:
with io.open('./test/testdata/m3u8/%s.m3u8' % m3u8_file,
mode='r', encoding='utf-8') as f:
formats = self.ie._parse_m3u8_formats(
formats, subs = self.ie._parse_m3u8_formats_and_subtitles(
f.read(), m3u8_url, ext='mp4')
self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None)
expect_value(self, subs, expected_subs, None)
def test_parse_mpd_formats(self):
_TEST_CASES = [
@@ -780,7 +949,8 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'tbr': 5997.485,
'width': 1920,
'height': 1080,
}]
}],
{},
), (
# https://github.com/ytdl-org/youtube-dl/pull/14844
'urls_only',
@@ -863,7 +1033,8 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'tbr': 4400,
'width': 1920,
'height': 1080,
}]
}],
{},
), (
# https://github.com/ytdl-org/youtube-dl/issues/20346
# Media considered unfragmented even though it contains
@@ -909,18 +1080,328 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'width': 360,
'height': 360,
'fps': 30,
}]
}],
{},
), (
'subtitles',
'https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd',
'https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/',
[{
"format_id": "audio=128001",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"ext": "m4a",
"tbr": 128.001,
"asr": 48000,
"format_note": "DASH audio",
"container": "m4a_dash",
"vcodec": "none",
"acodec": "mp4a.40.2",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"fragment_base_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/dash/",
"protocol": "http_dash_segments",
"audio_ext": "m4a",
"video_ext": "none",
"abr": 128.001,
}, {
"format_id": "video=100000",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"ext": "mp4",
"width": 336,
"height": 144,
"tbr": 100,
"format_note": "DASH video",
"container": "mp4_dash",
"vcodec": "avc1.4D401F",
"acodec": "none",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"fragment_base_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/dash/",
"protocol": "http_dash_segments",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 100,
}, {
"format_id": "video=326000",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"ext": "mp4",
"width": 562,
"height": 240,
"tbr": 326,
"format_note": "DASH video",
"container": "mp4_dash",
"vcodec": "avc1.4D401F",
"acodec": "none",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"fragment_base_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/dash/",
"protocol": "http_dash_segments",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 326,
}, {
"format_id": "video=698000",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"ext": "mp4",
"width": 844,
"height": 360,
"tbr": 698,
"format_note": "DASH video",
"container": "mp4_dash",
"vcodec": "avc1.4D401F",
"acodec": "none",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"fragment_base_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/dash/",
"protocol": "http_dash_segments",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 698,
}, {
"format_id": "video=1493000",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"ext": "mp4",
"width": 1126,
"height": 480,
"tbr": 1493,
"format_note": "DASH video",
"container": "mp4_dash",
"vcodec": "avc1.4D401F",
"acodec": "none",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"fragment_base_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/dash/",
"protocol": "http_dash_segments",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 1493,
}, {
"format_id": "video=4482000",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"ext": "mp4",
"width": 1688,
"height": 720,
"tbr": 4482,
"format_note": "DASH video",
"container": "mp4_dash",
"vcodec": "avc1.4D401F",
"acodec": "none",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"fragment_base_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/dash/",
"protocol": "http_dash_segments",
"video_ext": "mp4",
"audio_ext": "none",
"vbr": 4482,
}],
{
"en": [
{
"ext": "mp4",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/manifest.mpd",
"fragment_base_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/dash/",
"protocol": "http_dash_segments",
}
]
},
)
]
for mpd_file, mpd_url, mpd_base_url, expected_formats in _TEST_CASES:
for mpd_file, mpd_url, mpd_base_url, expected_formats, expected_subtitles in _TEST_CASES:
with io.open('./test/testdata/mpd/%s.mpd' % mpd_file,
mode='r', encoding='utf-8') as f:
formats = self.ie._parse_mpd_formats(
formats, subtitles = self.ie._parse_mpd_formats_and_subtitles(
compat_etree_fromstring(f.read().encode('utf-8')),
mpd_base_url=mpd_base_url, mpd_url=mpd_url)
self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None)
expect_value(self, subtitles, expected_subtitles, None)
def test_parse_ism_formats(self):
_TEST_CASES = [
(
'sintel',
'https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest',
[{
"format_id": "audio-128",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"ext": "isma",
"tbr": 128,
"asr": 48000,
"vcodec": "none",
"acodec": "AACL",
"protocol": "ism",
"_download_params": {
"stream_type": "audio",
"duration": 8880746666,
"timescale": 10000000,
"width": 0,
"height": 0,
"fourcc": "AACL",
"codec_private_data": "1190",
"sampling_rate": 48000,
"channels": 2,
"bits_per_sample": 16,
"nal_unit_length_field": 4
},
"audio_ext": "isma",
"video_ext": "none",
"abr": 128,
}, {
"format_id": "video-100",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"ext": "ismv",
"width": 336,
"height": 144,
"tbr": 100,
"vcodec": "AVC1",
"acodec": "none",
"protocol": "ism",
"_download_params": {
"stream_type": "video",
"duration": 8880746666,
"timescale": 10000000,
"width": 336,
"height": 144,
"fourcc": "AVC1",
"codec_private_data": "00000001674D401FDA0544EFFC2D002CBC40000003004000000C03C60CA80000000168EF32C8",
"channels": 2,
"bits_per_sample": 16,
"nal_unit_length_field": 4
},
"video_ext": "ismv",
"audio_ext": "none",
"vbr": 100,
}, {
"format_id": "video-326",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"ext": "ismv",
"width": 562,
"height": 240,
"tbr": 326,
"vcodec": "AVC1",
"acodec": "none",
"protocol": "ism",
"_download_params": {
"stream_type": "video",
"duration": 8880746666,
"timescale": 10000000,
"width": 562,
"height": 240,
"fourcc": "AVC1",
"codec_private_data": "00000001674D401FDA0241FE23FFC3BC83BA44000003000400000300C03C60CA800000000168EF32C8",
"channels": 2,
"bits_per_sample": 16,
"nal_unit_length_field": 4
},
"video_ext": "ismv",
"audio_ext": "none",
"vbr": 326,
}, {
"format_id": "video-698",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"ext": "ismv",
"width": 844,
"height": 360,
"tbr": 698,
"vcodec": "AVC1",
"acodec": "none",
"protocol": "ism",
"_download_params": {
"stream_type": "video",
"duration": 8880746666,
"timescale": 10000000,
"width": 844,
"height": 360,
"fourcc": "AVC1",
"codec_private_data": "00000001674D401FDA0350BFB97FF06AF06AD1000003000100000300300F1832A00000000168EF32C8",
"channels": 2,
"bits_per_sample": 16,
"nal_unit_length_field": 4
},
"video_ext": "ismv",
"audio_ext": "none",
"vbr": 698,
}, {
"format_id": "video-1493",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"ext": "ismv",
"width": 1126,
"height": 480,
"tbr": 1493,
"vcodec": "AVC1",
"acodec": "none",
"protocol": "ism",
"_download_params": {
"stream_type": "video",
"duration": 8880746666,
"timescale": 10000000,
"width": 1126,
"height": 480,
"fourcc": "AVC1",
"codec_private_data": "00000001674D401FDA011C3DE6FFF0D890D871000003000100000300300F1832A00000000168EF32C8",
"channels": 2,
"bits_per_sample": 16,
"nal_unit_length_field": 4
},
"video_ext": "ismv",
"audio_ext": "none",
"vbr": 1493,
}, {
"format_id": "video-4482",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"ext": "ismv",
"width": 1688,
"height": 720,
"tbr": 4482,
"vcodec": "AVC1",
"acodec": "none",
"protocol": "ism",
"_download_params": {
"stream_type": "video",
"duration": 8880746666,
"timescale": 10000000,
"width": 1688,
"height": 720,
"fourcc": "AVC1",
"codec_private_data": "00000001674D401FDA01A816F97FFC1ABC1AB440000003004000000C03C60CA80000000168EF32C8",
"channels": 2,
"bits_per_sample": 16,
"nal_unit_length_field": 4
},
"video_ext": "ismv",
"audio_ext": "none",
"vbr": 4482,
}],
{
"eng": [
{
"ext": "ismt",
"protocol": "ism",
"url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"manifest_url": "https://sdn-global-streaming-cache-3qsdn.akamaized.net/stream/3144/files/17/07/672975/3144-kZT4LWMQw6Rh7Kpd.ism/Manifest",
"_download_params": {
"stream_type": "text",
"duration": 8880746666,
"timescale": 10000000,
"fourcc": "TTML",
"codec_private_data": ""
}
}
]
},
),
]
for ism_file, ism_url, expected_formats, expected_subtitles in _TEST_CASES:
with io.open('./test/testdata/ism/%s.Manifest' % ism_file,
mode='r', encoding='utf-8') as f:
formats, subtitles = self.ie._parse_ism_formats_and_subtitles(
compat_etree_fromstring(f.read().encode('utf-8')), ism_url=ism_url)
self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None)
expect_value(self, subtitles, expected_subtitles, None)
def test_parse_f4m_formats(self):
_TEST_CASES = [

988
test/testdata/ism/sintel.Manifest vendored Normal file
View File

@@ -0,0 +1,988 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Unified Streaming Platform (version=1.10.18-20255) -->
<SmoothStreamingMedia
MajorVersion="2"
MinorVersion="0"
TimeScale="10000000"
Duration="8880746666">
<StreamIndex
Type="audio"
QualityLevels="1"
TimeScale="10000000"
Name="audio"
Chunks="445"
Url="QualityLevels({bitrate})/Fragments(audio={start time})">
<QualityLevel
Index="0"
Bitrate="128001"
CodecPrivateData="1190"
SamplingRate="48000"
Channels="2"
BitsPerSample="16"
PacketSize="4"
AudioTag="255"
FourCC="AACL" />
<c t="0" d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="20053333" />
<c d="20053333" />
<c d="20053334" />
<c d="19840000" />
<c d="746666" />
</StreamIndex>
<StreamIndex
Type="text"
QualityLevels="1"
TimeScale="10000000"
Language="eng"
Subtype="CAPT"
Name="textstream_eng"
Chunks="11"
Url="QualityLevels({bitrate})/Fragments(textstream_eng={start time})">
<QualityLevel
Index="0"
Bitrate="1000"
CodecPrivateData=""
FourCC="TTML" />
<c t="0" d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="600000000" />
<c d="240000000" />
</StreamIndex>
<StreamIndex
Type="video"
QualityLevels="5"
TimeScale="10000000"
Name="video"
Chunks="444"
Url="QualityLevels({bitrate})/Fragments(video={start time})"
MaxWidth="1688"
MaxHeight="720"
DisplayWidth="1689"
DisplayHeight="720">
<QualityLevel
Index="0"
Bitrate="100000"
CodecPrivateData="00000001674D401FDA0544EFFC2D002CBC40000003004000000C03C60CA80000000168EF32C8"
MaxWidth="336"
MaxHeight="144"
FourCC="AVC1" />
<QualityLevel
Index="1"
Bitrate="326000"
CodecPrivateData="00000001674D401FDA0241FE23FFC3BC83BA44000003000400000300C03C60CA800000000168EF32C8"
MaxWidth="562"
MaxHeight="240"
FourCC="AVC1" />
<QualityLevel
Index="2"
Bitrate="698000"
CodecPrivateData="00000001674D401FDA0350BFB97FF06AF06AD1000003000100000300300F1832A00000000168EF32C8"
MaxWidth="844"
MaxHeight="360"
FourCC="AVC1" />
<QualityLevel
Index="3"
Bitrate="1493000"
CodecPrivateData="00000001674D401FDA011C3DE6FFF0D890D871000003000100000300300F1832A00000000168EF32C8"
MaxWidth="1126"
MaxHeight="480"
FourCC="AVC1" />
<QualityLevel
Index="4"
Bitrate="4482000"
CodecPrivateData="00000001674D401FDA01A816F97FFC1ABC1AB440000003004000000C03C60CA80000000168EF32C8"
MaxWidth="1688"
MaxHeight="720"
FourCC="AVC1" />
<c t="0" d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
<c d="20000000" />
</StreamIndex>
</SmoothStreamingMedia>

38
test/testdata/m3u8/bipbop_16x9.m3u8 vendored Normal file
View File

@@ -0,0 +1,38 @@
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="bipbop_audio",LANGUAGE="eng",NAME="BipBop Audio 1",AUTOSELECT=YES,DEFAULT=YES
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="bipbop_audio",LANGUAGE="eng",NAME="BipBop Audio 2",AUTOSELECT=NO,DEFAULT=NO,URI="alternate_audio_aac/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=YES,AUTOSELECT=YES,FORCED=NO,LANGUAGE="en",CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog, public.accessibility.describes-music-and-sound",URI="subtitles/eng/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English (Forced)",DEFAULT=NO,AUTOSELECT=NO,FORCED=YES,LANGUAGE="en",URI="subtitles/eng_forced/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Français",DEFAULT=NO,AUTOSELECT=YES,FORCED=NO,LANGUAGE="fr",CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog, public.accessibility.describes-music-and-sound",URI="subtitles/fra/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Français (Forced)",DEFAULT=NO,AUTOSELECT=NO,FORCED=YES,LANGUAGE="fr",URI="subtitles/fra_forced/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Español",DEFAULT=NO,AUTOSELECT=YES,FORCED=NO,LANGUAGE="es",CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog, public.accessibility.describes-music-and-sound",URI="subtitles/spa/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Español (Forced)",DEFAULT=NO,AUTOSELECT=NO,FORCED=YES,LANGUAGE="es",URI="subtitles/spa_forced/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="日本語",DEFAULT=NO,AUTOSELECT=YES,FORCED=NO,LANGUAGE="ja",CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog, public.accessibility.describes-music-and-sound",URI="subtitles/jpn/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="日本語 (Forced)",DEFAULT=NO,AUTOSELECT=NO,FORCED=YES,LANGUAGE="ja",URI="subtitles/jpn_forced/prog_index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=263851,CODECS="mp4a.40.2, avc1.4d400d",RESOLUTION=416x234,AUDIO="bipbop_audio",SUBTITLES="subs"
gear1/prog_index.m3u8
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=28451,CODECS="avc1.4d400d",URI="gear1/iframe_index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=577610,CODECS="mp4a.40.2, avc1.4d401e",RESOLUTION=640x360,AUDIO="bipbop_audio",SUBTITLES="subs"
gear2/prog_index.m3u8
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=181534,CODECS="avc1.4d401e",URI="gear2/iframe_index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=915905,CODECS="mp4a.40.2, avc1.4d401f",RESOLUTION=960x540,AUDIO="bipbop_audio",SUBTITLES="subs"
gear3/prog_index.m3u8
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=297056,CODECS="avc1.4d401f",URI="gear3/iframe_index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=1030138,CODECS="mp4a.40.2, avc1.4d401f",RESOLUTION=1280x720,AUDIO="bipbop_audio",SUBTITLES="subs"
gear4/prog_index.m3u8
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=339492,CODECS="avc1.4d401f",URI="gear4/iframe_index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=1924009,CODECS="mp4a.40.2, avc1.4d401f",RESOLUTION=1920x1080,AUDIO="bipbop_audio",SUBTITLES="subs"
gear5/prog_index.m3u8
#EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=669554,CODECS="avc1.4d401f",URI="gear5/iframe_index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=41457,CODECS="mp4a.40.2",AUDIO="bipbop_audio",SUBTITLES="subs"
gear0/prog_index.m3u8

351
test/testdata/mpd/subtitles.mpd vendored Normal file
View File

@@ -0,0 +1,351 @@
<?xml version="1.0" encoding="utf-8"?>
<!-- Created with Unified Streaming Platform (version=1.10.18-20255) -->
<MPD
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="urn:mpeg:dash:schema:mpd:2011"
xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
type="static"
mediaPresentationDuration="PT14M48S"
maxSegmentDuration="PT1M"
minBufferTime="PT10S"
profiles="urn:mpeg:dash:profile:isoff-live:2011">
<Period
id="1"
duration="PT14M48S">
<BaseURL>dash/</BaseURL>
<AdaptationSet
id="1"
group="1"
contentType="audio"
segmentAlignment="true"
audioSamplingRate="48000"
mimeType="audio/mp4"
codecs="mp4a.40.2"
startWithSAP="1">
<AudioChannelConfiguration
schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011"
value="2" />
<Role schemeIdUri="urn:mpeg:dash:role:2011" value="main" />
<SegmentTemplate
timescale="48000"
initialization="3144-kZT4LWMQw6Rh7Kpd-$RepresentationID$.dash"
media="3144-kZT4LWMQw6Rh7Kpd-$RepresentationID$-$Time$.dash">
<SegmentTimeline>
<S t="0" d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="96256" r="2" />
<S d="95232" />
<S d="3584" />
</SegmentTimeline>
</SegmentTemplate>
<Representation
id="audio=128001"
bandwidth="128001">
</Representation>
</AdaptationSet>
<AdaptationSet
id="2"
group="3"
contentType="text"
lang="en"
mimeType="application/mp4"
codecs="stpp"
startWithSAP="1">
<Role schemeIdUri="urn:mpeg:dash:role:2011" value="subtitle" />
<SegmentTemplate
timescale="1000"
initialization="3144-kZT4LWMQw6Rh7Kpd-$RepresentationID$.dash"
media="3144-kZT4LWMQw6Rh7Kpd-$RepresentationID$-$Time$.dash">
<SegmentTimeline>
<S t="0" d="60000" r="9" />
<S d="24000" />
</SegmentTimeline>
</SegmentTemplate>
<Representation
id="textstream_eng=1000"
bandwidth="1000">
</Representation>
</AdaptationSet>
<AdaptationSet
id="3"
group="2"
contentType="video"
par="960:409"
minBandwidth="100000"
maxBandwidth="4482000"
maxWidth="1689"
maxHeight="720"
segmentAlignment="true"
mimeType="video/mp4"
codecs="avc1.4D401F"
startWithSAP="1">
<Role schemeIdUri="urn:mpeg:dash:role:2011" value="main" />
<SegmentTemplate
timescale="12288"
initialization="3144-kZT4LWMQw6Rh7Kpd-$RepresentationID$.dash"
media="3144-kZT4LWMQw6Rh7Kpd-$RepresentationID$-$Time$.dash">
<SegmentTimeline>
<S t="0" d="24576" r="443" />
</SegmentTimeline>
</SegmentTemplate>
<Representation
id="video=100000"
bandwidth="100000"
width="336"
height="144"
sar="2880:2863"
scanType="progressive">
</Representation>
<Representation
id="video=326000"
bandwidth="326000"
width="562"
height="240"
sar="115200:114929"
scanType="progressive">
</Representation>
<Representation
id="video=698000"
bandwidth="698000"
width="844"
height="360"
sar="86400:86299"
scanType="progressive">
</Representation>
<Representation
id="video=1493000"
bandwidth="1493000"
width="1126"
height="480"
sar="230400:230267"
scanType="progressive">
</Representation>
<Representation
id="video=4482000"
bandwidth="4482000"
width="1688"
height="720"
sar="86400:86299"
scanType="progressive">
</Representation>
</AdaptationSet>
</Period>
</MPD>

View File

@@ -19,7 +19,6 @@ import platform
import re
import shutil
import subprocess
import socket
import sys
import time
import tokenize
@@ -33,7 +32,6 @@ from .compat import (
compat_basestring,
compat_cookiejar,
compat_get_terminal_size,
compat_http_client,
compat_kwargs,
compat_numeric_types,
compat_os_name,
@@ -77,6 +75,7 @@ from .utils import (
make_dir,
make_HTTPS_handler,
MaxDownloadsReached,
network_exceptions,
orderedSet,
PagedList,
parse_filesize,
@@ -85,6 +84,7 @@ from .utils import (
PostProcessingError,
preferredencoding,
prepend_extension,
random_uuidv4,
register_socks_protocols,
render_table,
replace_extension,
@@ -385,6 +385,10 @@ class YoutubeDL(object):
Use the native HLS downloader instead of ffmpeg/avconv
if True, otherwise use ffmpeg/avconv if False, otherwise
use downloader suggested by extractor if None.
compat_opts: Compatibility options. See "Differences in default behavior".
Note that only format-sort, format-spec, no-live-chat,
playlist-index, list-formats, no-youtube-channel-redirect
and no-youtube-unavailable-videos works when used via the API
The following parameters are not used by YoutubeDL itself, they are used by
the downloader (see yt_dlp/downloader/common.py):
@@ -462,39 +466,29 @@ class YoutubeDL(object):
}
self.params.update(params)
self.cache = Cache(self)
self.archive = set()
"""Preload the archive, if any is specified"""
def preload_download_archive(self):
fn = self.params.get('download_archive')
if fn is None:
return False
try:
with locked_file(fn, 'r', encoding='utf-8') as archive_file:
for line in archive_file:
self.archive.add(line.strip())
except IOError as ioe:
if ioe.errno != errno.ENOENT:
raise
return False
return True
if sys.version_info < (3, 6):
self.report_warning(
'Support for Python version %d.%d have been deprecated and will break in future versions of yt-dlp! '
'Update to Python 3.6 or above' % sys.version_info[:2])
def check_deprecated(param, option, suggestion):
if self.params.get(param) is not None:
self.report_warning(
'%s is deprecated. Use %s instead.' % (option, suggestion))
self.report_warning('%s is deprecated. Use %s instead' % (option, suggestion))
return True
return False
if self.params.get('verbose'):
self.to_stdout('[debug] Loading archive file %r' % self.params.get('download_archive'))
preload_download_archive(self)
if check_deprecated('cn_verification_proxy', '--cn-verification-proxy', '--geo-verification-proxy'):
if self.params.get('geo_verification_proxy') is None:
self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
check_deprecated('autonumber', '--auto-number', '-o "%(autonumber)s-%(title)s.%(ext)s"')
check_deprecated('usetitle', '--title', '-o "%(title)s-%(id)s.%(ext)s"')
check_deprecated('useid', '--id', '-o "%(id)s.%(ext)s"')
for msg in self.params.get('warnings', []):
self.report_warning(msg)
if self.params.get('final_ext'):
if self.params.get('merge_output_format'):
self.report_warning('--merge-output-format will be ignored since --remux-video or --recode-video is given')
@@ -503,10 +497,6 @@ class YoutubeDL(object):
if 'overwrites' in self.params and self.params['overwrites'] is None:
del self.params['overwrites']
check_deprecated('autonumber_size', '--autonumber-size', 'output template with %(autonumber)0Nd, where N in the number of digits')
check_deprecated('autonumber', '--auto-number', '-o "%(autonumber)s-%(title)s.%(ext)s"')
check_deprecated('usetitle', '--title', '-o "%(title)s-%(id)s.%(ext)s"')
if params.get('bidi_workaround', False):
try:
import pty
@@ -548,6 +538,25 @@ class YoutubeDL(object):
self._setup_opener()
"""Preload the archive, if any is specified"""
def preload_download_archive(fn):
if fn is None:
return False
if self.params.get('verbose'):
self._write_string('[debug] Loading archive file %r\n' % fn)
try:
with locked_file(fn, 'r', encoding='utf-8') as archive_file:
for line in archive_file:
self.archive.add(line.strip())
except IOError as ioe:
if ioe.errno != errno.ENOENT:
raise
return False
return True
self.archive = set()
preload_download_archive(self.params.get('download_archive'))
if auto_init:
self.print_debug_header()
self.add_default_info_extractors()
@@ -642,16 +651,18 @@ class YoutubeDL(object):
def to_screen(self, message, skip_eol=False):
"""Print message to stdout if not in quiet mode."""
return self.to_stdout(message, skip_eol, check_quiet=True)
return self.to_stdout(
message, skip_eol,
quiet=self.params.get('quiet', False))
def _write_string(self, s, out=None):
write_string(s, out=out, encoding=self.params.get('encoding'))
def to_stdout(self, message, skip_eol=False, check_quiet=False):
def to_stdout(self, message, skip_eol=False, quiet=False):
"""Print message to stdout if not in quiet mode."""
if self.params.get('logger'):
self.params['logger'].debug(message)
elif not check_quiet or not self.params.get('quiet', False):
elif not quiet:
message = self._bidi_workaround(message)
terminator = ['\n', ''][skip_eol]
output = message + terminator
@@ -826,7 +837,7 @@ class YoutubeDL(object):
# For fields playlist_index and autonumber convert all occurrences
# of %(field)s to %(field)0Nd for backward compatibility
field_size_compat_map = {
'playlist_index': len(str(template_dict.get('n_entries', na))),
'playlist_index': len(str(template_dict.get('_last_playlist_index') or '')),
'autonumber': autonumber_size,
}
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
@@ -841,29 +852,67 @@ class YoutubeDL(object):
if sanitize is None:
sanitize = lambda k, v: v
# Internal Formatting = name.key1.key2+number>strf
INTERNAL_FORMAT_RE = FORMAT_RE.format(
r'''(?P<final_key>
(?P<fields>\w+(?:\.[-\w]+)*)
(?:\+(?P<add>-?\d+(?:\.\d+)?))?
(?:>(?P<strf_format>.+?))?
)''')
for mobj in re.finditer(INTERNAL_FORMAT_RE, outtmpl):
mobj = mobj.groupdict()
# Object traversal
fields = mobj['fields'].split('.')
final_key = mobj['final_key']
value = traverse_dict(template_dict, fields)
# Offset the value
if mobj['add']:
value = float_or_none(value)
if value is not None:
value = value + float(mobj['add'])
# Datetime formatting
if mobj['strf_format']:
value = strftime_or_none(value, mobj['strf_format'])
if mobj['type'] in 'crs' and value is not None: # string
value = sanitize('%{}'.format(mobj['type']) % fields[-1], value)
EXTERNAL_FORMAT_RE = FORMAT_RE.format('(?P<key>[^)]*)')
# Field is of the form key1.key2...
# where keys (except first) can be string, int or slice
FIELD_RE = r'\w+(?:\.(?:\w+|[-\d]*(?::[-\d]*){0,2}))*'
INTERNAL_FORMAT_RE = re.compile(r'''(?x)
(?P<negate>-)?
(?P<fields>{0})
(?P<maths>(?:[-+]-?(?:\d+(?:\.\d+)?|{0}))*)
(?:>(?P<strf_format>.+?))?
(?:\|(?P<default>.*?))?
$'''.format(FIELD_RE))
MATH_OPERATORS_RE = re.compile(r'(?<![-+])([-+])')
MATH_FUNCTIONS = {
'+': float.__add__,
'-': float.__sub__,
}
for outer_mobj in re.finditer(EXTERNAL_FORMAT_RE, outtmpl):
final_key = outer_mobj.group('key')
str_type = outer_mobj.group('type')
value = None
mobj = re.match(INTERNAL_FORMAT_RE, final_key)
if mobj is not None:
mobj = mobj.groupdict()
# Object traversal
fields = mobj['fields'].split('.')
value = traverse_dict(template_dict, fields)
# Negative
if mobj['negate']:
value = float_or_none(value)
if value is not None:
value *= -1
# Do maths
if mobj['maths']:
value = float_or_none(value)
operator = None
for item in MATH_OPERATORS_RE.split(mobj['maths'])[1:]:
if item == '':
value = None
if value is None:
break
if operator:
item, multiplier = (item[1:], -1) if item[0] == '-' else (item, 1)
offset = float_or_none(item)
if offset is None:
offset = float_or_none(traverse_dict(template_dict, item.split('.')))
try:
value = operator(value, multiplier * offset)
except (TypeError, ZeroDivisionError):
value = None
operator = None
else:
operator = MATH_FUNCTIONS[item]
# Datetime formatting
if mobj['strf_format']:
value = strftime_or_none(value, mobj['strf_format'])
# Set default
if value is None and mobj['default'] is not None:
value = mobj['default']
# Sanitize
if str_type in 'crs' and value is not None: # string
value = sanitize('%{}'.format(str_type) % fields[-1], value)
else: # numeric
numeric_fields.append(final_key)
value = float_or_none(value)
@@ -1013,13 +1062,22 @@ class YoutubeDL(object):
for key, value in extra_info.items():
info_dict.setdefault(key, value)
def extract_info(self, url, download=True, ie_key=None, info_dict=None, extra_info={},
def extract_info(self, url, download=True, ie_key=None, extra_info={},
process=True, force_generic_extractor=False):
'''
Returns a list with a dictionary for each video we find.
If 'download', also downloads the videos.
extra_info is a dict containing the extra values to add to each result
'''
"""
Return a list with a dictionary for each video extracted.
Arguments:
url -- URL to extract
Keyword arguments:
download -- whether to download videos during extraction
ie_key -- extractor key hint
extra_info -- dictionary containing the extra values to add to each result
process -- whether to resolve all unresolved references (URLs, playlist items),
must be True for download to work.
force_generic_extractor -- force using the generic extractor
"""
if not ie_key and force_generic_extractor:
ie_key = 'Generic'
@@ -1049,7 +1107,7 @@ class YoutubeDL(object):
self.to_screen("[%s] %s: has already been recorded in archive" % (
ie_key, temp_id))
break
return self.__extract_info(url, ie, download, extra_info, process, info_dict)
return self.__extract_info(url, ie, download, extra_info, process)
else:
self.report_error('no suitable InfoExtractor for URL %s' % url)
@@ -1076,7 +1134,7 @@ class YoutubeDL(object):
return wrapper
@__handle_extraction_exceptions
def __extract_info(self, url, ie, download, extra_info, process, info_dict):
def __extract_info(self, url, ie, download, extra_info, process):
ie_result = ie.extract(url)
if ie_result is None: # Finished already (backwards compatibility; listformats and friends should be moved here)
return
@@ -1086,11 +1144,6 @@ class YoutubeDL(object):
'_type': 'compat_list',
'entries': ie_result,
}
if info_dict:
if info_dict.get('id'):
ie_result['id'] = info_dict['id']
if info_dict.get('title'):
ie_result['title'] = info_dict['title']
self.add_default_extra_info(ie_result, ie, url)
if process:
return self.process_ie_result(ie_result, download, extra_info)
@@ -1130,7 +1183,7 @@ class YoutubeDL(object):
# We have to add extra_info to the results because it may be
# contained in a playlist
return self.extract_info(ie_result['url'],
download, info_dict=ie_result,
download,
ie_key=ie_result.get('ie_key'),
extra_info=extra_info)
elif result_type == 'url_transparent':
@@ -1300,7 +1353,7 @@ class YoutubeDL(object):
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': 0
'playlist_index': 0,
}
ie_copy.update(dict(ie_result))
@@ -1334,6 +1387,11 @@ class YoutubeDL(object):
self.report_error('Cannot write playlist description file ' + descfn)
return
# Save playlist_index before re-ordering
entries = [
((playlistitems[i - 1] if playlistitems else i), entry)
for i, entry in enumerate(entries, 1)]
if self.params.get('playlistreverse', False):
entries = entries[::-1]
if self.params.get('playlistrandom', False):
@@ -1344,7 +1402,10 @@ class YoutubeDL(object):
self.to_screen('[%s] playlist %s: %s' % (ie_result['extractor'], playlist, msg))
failures = 0
max_failures = self.params.get('skip_playlist_after_errors') or float('inf')
for i, entry in enumerate(entries, 1):
for i, entry_tuple in enumerate(entries, 1):
playlist_index, entry = entry_tuple
if 'playlist_index' in self.params.get('compat_options', []):
playlist_index = playlistitems[i - 1] if playlistitems else i
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
# This __x_forwarded_for_ip thing is a bit ugly but requires
# minimal changes
@@ -1352,12 +1413,14 @@ class YoutubeDL(object):
entry['__x_forwarded_for_ip'] = x_forwarded_for
extra = {
'n_entries': n_entries,
'_last_playlist_index': max(playlistitems) if playlistitems else (playlistend or n_entries),
'playlist_index': playlist_index,
'playlist_autonumber': i,
'playlist': playlist,
'playlist_id': ie_result.get('id'),
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': playlistitems[i - 1] if playlistitems else i,
'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']),
@@ -1461,12 +1524,14 @@ class YoutubeDL(object):
not can_merge()
or info_dict.get('is_live', False)
or self.outtmpl_dict['default'] == '-'))
compat = (
prefer_best
or self.params.get('allow_multiple_audio_streams', False)
or 'format-spec' in self.params.get('compat_opts', []))
return (
'best/bestvideo+bestaudio'
if prefer_best
else 'bestvideo*+bestaudio/best'
if not self.params.get('allow_multiple_audio_streams', False)
'best/bestvideo+bestaudio' if prefer_best
else 'bestvideo*+bestaudio/best' if not compat
else 'bestvideo+bestaudio/best')
def build_format_selector(self, format_spec):
@@ -1485,6 +1550,8 @@ class YoutubeDL(object):
allow_multiple_streams = {'audio': self.params.get('allow_multiple_audio_streams', False),
'video': self.params.get('allow_multiple_video_streams', False)}
check_formats = self.params.get('check_formats')
def _parse_filter(tokens):
filter_parts = []
for type, string, start, _, _ in tokens:
@@ -1642,6 +1709,22 @@ class YoutubeDL(object):
return new_dict
def _check_formats(formats):
for f in formats:
self.to_screen('[info] Testing format %s' % f['format_id'])
paths = self.params.get('paths', {})
temp_file = os.path.join(
expand_path(paths.get('home', '').strip()),
expand_path(paths.get('temp', '').strip()),
'ytdl.%s.f%s.check-format' % (random_uuidv4(), f['format_id']))
dl, _ = self.dl(temp_file, f, test=True)
if os.path.exists(temp_file):
os.remove(temp_file)
if dl:
yield f
else:
self.to_screen('[info] Unable to download format %s. Skipping...' % f['format_id'])
def _build_selector_function(selector):
if isinstance(selector, list): # ,
fs = [_build_selector_function(s) for s in selector]
@@ -1666,18 +1749,19 @@ class YoutubeDL(object):
return []
elif selector.type == SINGLE: # atom
format_spec = (selector.selector or 'best').lower()
format_spec = selector.selector or 'best'
# TODO: Add allvideo, allaudio etc by generalizing the code with best/worst selector
if format_spec == 'all':
def selector_function(ctx):
formats = list(ctx['formats'])
if formats:
for f in formats:
yield f
if check_formats:
formats = _check_formats(formats)
for f in formats:
yield f
elif format_spec == 'mergeall':
def selector_function(ctx):
formats = list(ctx['formats'])
formats = list(_check_formats(ctx['formats']))
if not formats:
return
merged_format = formats[-1]
@@ -1686,13 +1770,13 @@ class YoutubeDL(object):
yield merged_format
else:
format_fallback = False
format_fallback, format_reverse, format_idx = False, True, 1
mobj = re.match(
r'(?P<bw>best|worst|b|w)(?P<type>video|audio|v|a)?(?P<mod>\*)?(?:\.(?P<n>[1-9]\d*))?$',
format_spec)
if mobj is not None:
format_idx = int_or_none(mobj.group('n'), default=1)
format_idx = format_idx - 1 if mobj.group('bw')[0] == 'w' else -format_idx
format_reverse = mobj.group('bw')[0] == 'b'
format_type = (mobj.group('type') or [None])[0]
not_format_type = {'v': 'a', 'a': 'v'}.get(format_type)
format_modified = mobj.group('mod') is not None
@@ -1707,7 +1791,6 @@ class YoutubeDL(object):
if not format_modified # b, w
else None) # b*, w*
else:
format_idx = -1
filter_f = ((lambda f: f.get('ext') == format_spec)
if format_spec in ['mp4', 'flv', 'webm', '3gp', 'm4a', 'mp3', 'ogg', 'aac', 'wav'] # extension
else (lambda f: f.get('format_id') == format_spec)) # id
@@ -1717,16 +1800,18 @@ class YoutubeDL(object):
if not formats:
return
matches = list(filter(filter_f, formats)) if filter_f is not None else formats
n = len(matches)
if -n <= format_idx < n:
yield matches[format_idx]
elif format_fallback and ctx['incomplete_formats']:
if format_fallback and ctx['incomplete_formats'] and not matches:
# for extractors with incomplete formats (audio only (soundcloud)
# or video only (imgur)) best/worst will fallback to
# best/worst {video,audio}-only format
n = len(formats)
if -n <= format_idx < n:
yield formats[format_idx]
matches = formats
if format_reverse:
matches = matches[::-1]
if check_formats:
matches = list(itertools.islice(_check_formats(matches), format_idx))
n = len(matches)
if -n <= format_idx - 1 < n:
yield matches[format_idx - 1]
elif selector.type == MERGE: # +
selector_1, selector_2 = map(_build_selector_function, selector.selector)
@@ -2143,6 +2228,34 @@ class YoutubeDL(object):
self.post_extract(info_dict)
self.to_stdout(json.dumps(info_dict, default=repr))
def dl(self, name, info, subtitle=False, test=False):
if test:
verbose = self.params.get('verbose')
params = {
'test': True,
'quiet': not verbose,
'verbose': verbose,
'noprogress': not verbose,
'nopart': True,
'skip_unavailable_fragments': False,
'keep_fragments': False,
'overwrites': True,
'_no_ytdl_file': True,
}
else:
params = self.params
fd = get_suitable_downloader(info, params)(self, params)
if not test:
for ph in self._progress_hooks:
fd.add_progress_hook(ph)
if self.params.get('verbose'):
self.to_screen('[debug] Invoking downloader on %r' % info.get('url'))
new_info = dict(info)
if new_info.get('http_headers') is None:
new_info['http_headers'] = self._calc_headers(new_info)
return fd.download(name, new_info, subtitle)
def process_info(self, info_dict):
"""Process a single resolved IE result."""
@@ -2228,17 +2341,6 @@ class YoutubeDL(object):
self.report_error('Cannot write annotations file: ' + annofn)
return
def dl(name, info, subtitle=False):
fd = get_suitable_downloader(info, self.params)(self, self.params)
for ph in self._progress_hooks:
fd.add_progress_hook(ph)
if self.params.get('verbose'):
self.to_screen('[debug] Invoking downloader on %r' % info.get('url'))
new_info = dict(info)
if new_info.get('http_headers') is None:
new_info['http_headers'] = self._calc_headers(new_info)
return fd.download(name, new_info, subtitle)
subtitles_are_requested = any([self.params.get('writesubtitles', False),
self.params.get('writeautomaticsub')])
@@ -2271,10 +2373,10 @@ class YoutubeDL(object):
return
else:
try:
dl(sub_filename, sub_info.copy(), subtitle=True)
self.dl(sub_filename, sub_info.copy(), subtitle=True)
sub_info['filepath'] = sub_filename
files_to_move[sub_filename] = sub_filename_final
except (ExtractorError, IOError, OSError, ValueError, compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
except tuple([ExtractorError, IOError, OSError, ValueError] + network_exceptions) as err:
self.report_warning('Unable to download subtitle for "%s": %s' %
(sub_lang, error_to_compat_str(err)))
continue
@@ -2457,7 +2559,7 @@ class YoutubeDL(object):
if not self._ensure_dir_exists(fname):
return
downloaded.append(fname)
partial_success, real_download = dl(fname, new_info)
partial_success, real_download = self.dl(fname, new_info)
info_dict['__real_download'] = info_dict['__real_download'] or real_download
success = success and partial_success
if merger.available and not self.params.get('allow_unplayable_formats'):
@@ -2472,13 +2574,13 @@ class YoutubeDL(object):
# Just a single file
dl_filename = existing_file(full_filename, temp_filename)
if dl_filename is None:
success, real_download = dl(temp_filename, info_dict)
success, real_download = self.dl(temp_filename, info_dict)
info_dict['__real_download'] = real_download
dl_filename = dl_filename or temp_filename
info_dict['__finaldir'] = os.path.dirname(os.path.abspath(encodeFilename(full_filename)))
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
except network_exceptions as err:
self.report_error('unable to download video data: %s' % error_to_compat_str(err))
return
except (OSError, IOError) as err:
@@ -2818,7 +2920,9 @@ class YoutubeDL(object):
def list_formats(self, info_dict):
formats = info_dict.get('formats', [info_dict])
new_format = self.params.get('listformats_table', False)
new_format = (
'list-formats' not in self.params.get('compat_opts', [])
and self.params.get('list_formats_as_table', True) is not False)
if new_format:
table = [
[
@@ -2919,6 +3023,9 @@ class YoutubeDL(object):
if _PLUGIN_CLASSES:
self._write_string(
'[debug] Plugin Extractors: %s\n' % [ie.ie_key() for ie in _PLUGIN_CLASSES])
if self.params.get('compat_opts'):
self._write_string(
'[debug] Compatibility options: %s\n' % ', '.join(self.params.get('compat_opts')))
try:
sp = subprocess.Popen(
['git', 'rev-parse', '--short', 'HEAD'],
@@ -3073,7 +3180,7 @@ class YoutubeDL(object):
ret.append(suffix + thumb_ext)
self.to_screen('[%s] %s: Writing thumbnail %sto: %s' %
(info_dict['extractor'], info_dict['id'], thumb_display_id, thumb_filename))
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
except network_exceptions as err:
self.report_warning('Unable to download thumbnail "%s": %s' %
(t['url'], error_to_compat_str(err)))
if ret and not write_all:

View File

@@ -60,6 +60,7 @@ def _real_main(argv=None):
setproctitle('yt-dlp')
parser, opts, args = parseOpts(argv)
warnings = []
# Set user agent
if opts.user_agent is not None:
@@ -128,16 +129,12 @@ def _real_main(argv=None):
parser.error('account username missing\n')
if opts.ap_password is not None and opts.ap_username is None:
parser.error('TV Provider account username missing\n')
if opts.outtmpl is not None and (opts.usetitle or opts.autonumber or opts.useid):
parser.error('using output template conflicts with using title, video ID or auto number')
if opts.autonumber_size is not None:
if opts.autonumber_size <= 0:
parser.error('auto number size must be positive')
if opts.autonumber_start is not None:
if opts.autonumber_start < 0:
parser.error('auto number start must be positive or 0')
if opts.usetitle and opts.useid:
parser.error('using title conflicts with using video ID')
if opts.username is not None and opts.password is None:
opts.password = compat_getpass('Type account password and press [Return]: ')
if opts.ap_username is not None and opts.ap_password is None:
@@ -177,8 +174,7 @@ def _real_main(argv=None):
parser.error('requests sleep interval must be positive or 0')
if opts.ap_mso and opts.ap_mso not in MSO_INFO:
parser.error('Unsupported TV Provider, use --ap-list-mso to get a list of supported TV Providers')
if opts.overwrites:
# --yes-overwrites implies --no-continue
if opts.overwrites: # --yes-overwrites implies --no-continue
opts.continue_dl = False
if opts.concurrent_fragment_downloads <= 0:
raise ValueError('Concurrent fragments must be positive')
@@ -239,21 +235,75 @@ def _real_main(argv=None):
else:
date = DateRange(opts.dateafter, opts.datebefore)
# Do not download videos when there are audio-only formats
def parse_compat_opts():
parsed_compat_opts, compat_opts = set(), opts.compat_opts[::-1]
while compat_opts:
actual_opt = opt = compat_opts.pop().lower()
if opt == 'youtube-dl':
compat_opts.extend(['-multistreams', 'all'])
elif opt == 'youtube-dlc':
compat_opts.extend(['-no-youtube-channel-redirect', '-no-live-chat', 'all'])
elif opt == 'all':
parsed_compat_opts.update(all_compat_opts)
elif opt == '-all':
parsed_compat_opts = set()
else:
if opt[0] == '-':
opt = opt[1:]
parsed_compat_opts.discard(opt)
else:
parsed_compat_opts.update([opt])
if opt not in all_compat_opts:
parser.error('Invalid compatibility option %s' % actual_opt)
return parsed_compat_opts
all_compat_opts = [
'filename', 'format-sort', 'abort-on-error', 'format-spec', 'multistreams',
'no-playlist-metafiles', 'no-live-chat', 'playlist-index', 'list-formats',
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos',
]
compat_opts = parse_compat_opts()
def _unused_compat_opt(name):
if name not in compat_opts:
return False
compat_opts.discard(name)
compat_opts.update(['*%s' % name])
return True
def set_default_compat(compat_name, opt_name, default=True, remove_compat=False):
attr = getattr(opts, opt_name)
if compat_name in compat_opts:
if attr is None:
setattr(opts, opt_name, not default)
return True
else:
if remove_compat:
_unused_compat_opt(compat_name)
return False
elif attr is None:
setattr(opts, opt_name, default)
return None
set_default_compat('abort-on-error', 'ignoreerrors')
set_default_compat('no-playlist-metafiles', 'allow_playlist_files')
if 'format-sort' in compat_opts:
opts.format_sort.extend(InfoExtractor.FormatSort.ytdl_default)
_video_multistreams_set = set_default_compat('multistreams', 'allow_multiple_video_streams', False, remove_compat=False)
_audio_multistreams_set = set_default_compat('multistreams', 'allow_multiple_audio_streams', False, remove_compat=False)
if _video_multistreams_set is False and _audio_multistreams_set is False:
_unused_compat_opt('multistreams')
outtmpl_default = opts.outtmpl.get('default')
if 'filename' in compat_opts:
if outtmpl_default is None:
outtmpl_default = '%(title)s.%(id)s.%(ext)s'
opts.outtmpl.update({'default': outtmpl_default})
else:
_unused_compat_opt('filename')
if opts.extractaudio and not opts.keepvideo and opts.format is None:
opts.format = 'bestaudio/best'
outtmpl = opts.outtmpl
if not outtmpl:
outtmpl = {'default': (
'%(title)s-%(id)s-%(format)s.%(ext)s' if opts.format == '-1' and opts.usetitle
else '%(id)s-%(format)s.%(ext)s' if opts.format == '-1'
else '%(autonumber)s-%(title)s-%(id)s.%(ext)s' if opts.usetitle and opts.autonumber
else '%(title)s-%(id)s.%(ext)s' if opts.usetitle
else '%(id)s.%(ext)s' if opts.useid
else '%(autonumber)s-%(id)s.%(ext)s' if opts.autonumber
else None)}
outtmpl_default = outtmpl.get('default')
if outtmpl_default is not None and not os.path.splitext(outtmpl_default)[1] and opts.extractaudio:
parser.error('Cannot download a video and extract audio into the same'
' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
@@ -281,7 +331,7 @@ def _real_main(argv=None):
opts.writeinfojson = True
def report_conflict(arg1, arg2):
write_string('WARNING: %s is ignored since %s was given\n' % (arg2, arg1), out=sys.stderr)
warnings.append('%s is ignored since %s was given' % (arg2, arg1))
if opts.remuxvideo and opts.recodevideo:
report_conflict('--recode-video', '--remux-video')
@@ -419,11 +469,10 @@ def _real_main(argv=None):
})
def report_args_compat(arg, name):
write_string(
'WARNING: %s given without specifying name. The arguments will be given to all %s\n' % (arg, name),
out=sys.stderr)
warnings.append('%s given without specifying name. The arguments will be given to all %s' % (arg, name))
if 'default' in opts.external_downloader_args:
report_args_compat('--external-downloader-args', 'external downloaders')
report_args_compat('--downloader-args', 'external downloaders')
if 'default-compat' in opts.postprocessor_args and 'default' not in opts.postprocessor_args:
report_args_compat('--post-processor-args', 'post-processors')
@@ -471,9 +520,10 @@ def _real_main(argv=None):
'format_sort_force': opts.format_sort_force,
'allow_multiple_video_streams': opts.allow_multiple_video_streams,
'allow_multiple_audio_streams': opts.allow_multiple_audio_streams,
'check_formats': opts.check_formats,
'listformats': opts.listformats,
'listformats_table': opts.listformats_table,
'outtmpl': outtmpl,
'outtmpl': opts.outtmpl,
'outtmpl_na_placeholder': opts.outtmpl_na_placeholder,
'paths': opts.paths,
'autonumber_size': opts.autonumber_size,
@@ -588,9 +638,12 @@ def _real_main(argv=None):
'geo_bypass': opts.geo_bypass,
'geo_bypass_country': opts.geo_bypass_country,
'geo_bypass_ip_block': opts.geo_bypass_ip_block,
'warnings': warnings,
'compat_opts': compat_opts,
# just for deprecation check
'autonumber': opts.autonumber if opts.autonumber is True else None,
'usetitle': opts.usetitle if opts.usetitle is True else None,
'autonumber': opts.autonumber or None,
'usetitle': opts.usetitle or None,
'useid': opts.useid or None,
}
with YoutubeDL(ydl_opts) as ydl:

View File

@@ -3018,10 +3018,24 @@ else:
return ctypes.WINFUNCTYPE(*args, **kwargs)
try:
compat_Pattern = re.Pattern
except AttributeError:
compat_Pattern = type(re.compile(''))
try:
compat_Match = re.Match
except AttributeError:
compat_Match = type(re.compile('').match(''))
__all__ = [
'compat_HTMLParseError',
'compat_HTMLParser',
'compat_HTTPError',
'compat_Match',
'compat_Pattern',
'compat_Struct',
'compat_b64decode',
'compat_basestring',

View File

@@ -31,6 +31,7 @@ from .external import (
PROTOCOL_MAP = {
'rtmp': RtmpFD,
'rtmp_ffmpeg': FFmpegFD,
'm3u8_native': HlsFD,
'm3u8': FFmpegFD,
'mms': RtspFD,
@@ -46,6 +47,7 @@ PROTOCOL_MAP = {
def shorten_protocol_name(proto, simplify=False):
short_protocol_names = {
'm3u8_native': 'm3u8_n',
'rtmp_ffmpeg': 'rtmp_f',
'http_dash_segments': 'dash',
'niconico_dmc': 'dmc',
}
@@ -54,6 +56,7 @@ def shorten_protocol_name(proto, simplify=False):
'https': 'http',
'ftps': 'ftp',
'm3u8_native': 'm3u8',
'rtmp_ffmpeg': 'rtmp',
'm3u8_frag_urls': 'm3u8',
'dash_frag_urls': 'dash',
})

View File

@@ -147,10 +147,10 @@ class FileDownloader(object):
return int(round(number * multiplier))
def to_screen(self, *args, **kargs):
self.ydl.to_screen(*args, **kargs)
self.ydl.to_stdout(*args, quiet=self.params.get('quiet'), **kargs)
def to_stderr(self, message):
self.ydl.to_screen(message)
self.ydl.to_stderr(message)
def to_console_title(self, message):
self.ydl.to_console_title(message)

View File

@@ -1,5 +1,6 @@
from __future__ import unicode_literals
import errno
try:
import concurrent.futures
can_threaded_download = True
@@ -112,12 +113,14 @@ class DashSegmentsFD(FragmentFD):
if count > fragment_retries:
if not fatal:
return False, frag_index
ctx['dest_stream'].close()
self.report_error('Giving up after %s fragment retries' % fragment_retries)
return False, frag_index
return frag_content, frag_index
def append_fragment(frag_content, frag_index):
fatal = frag_index == 1 or not skip_unavailable_fragments
if frag_content:
fragment_filename = '%s-Frag%d' % (ctx['tmpfilename'], frag_index)
try:
@@ -126,19 +129,24 @@ class DashSegmentsFD(FragmentFD):
file.close()
self._append_fragment(ctx, frag_content)
return True
except FileNotFoundError:
if skip_unavailable_fragments:
except EnvironmentError as ose:
if ose.errno != errno.ENOENT:
raise
# FileNotFoundError
if not fatal:
self.report_skip_fragment(frag_index)
return True
else:
ctx['dest_stream'].close()
self.report_error(
'fragment %s not found, unable to continue' % frag_index)
return False
else:
if skip_unavailable_fragments:
if not fatal:
self.report_skip_fragment(frag_index)
return True
else:
ctx['dest_stream'].close()
self.report_error(
'fragment %s not found, unable to continue' % frag_index)
return False

View File

@@ -290,11 +290,19 @@ class Aria2cFD(ExternalFD):
cmd += self._bool_option('--remote-time', 'updatetime', 'true', 'false', '=')
cmd += self._configuration_args()
# aria2c strips out spaces from the beginning/end of filenames and paths.
# We work around this issue by adding a "./" to the beginning of the
# filename and relative path, and adding a "/" at the end of the path.
# See: https://github.com/yt-dlp/yt-dlp/issues/276
# https://github.com/ytdl-org/youtube-dl/issues/20312
# https://github.com/aria2/aria2/issues/1373
dn = os.path.dirname(tmpfilename)
if dn:
cmd += ['--dir', dn]
if not os.path.isabs(dn):
dn = '.%s%s' % (os.path.sep, dn)
cmd += ['--dir', dn + os.path.sep]
if 'fragments' not in info_dict:
cmd += ['--out', os.path.basename(tmpfilename)]
cmd += ['--out', '.%s%s' % (os.path.sep, os.path.basename(tmpfilename))]
cmd += ['--auto-file-renaming=false']
if 'fragments' in info_dict:
@@ -330,7 +338,7 @@ class HttpieFD(ExternalFD):
class FFmpegFD(ExternalFD):
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps', 'm3u8', 'm3u8_native', 'rtsp', 'rtmp', 'mms')
SUPPORTED_PROTOCOLS = ('http', 'https', 'ftp', 'ftps', 'm3u8', 'm3u8_native', 'rtsp', 'rtmp', 'rtmp_ffmpeg', 'mms')
@classmethod
def available(cls, path=None):

View File

@@ -31,6 +31,7 @@ class FragmentFD(FileDownloader):
Skip unavailable fragments (DASH and hlsnative only)
keep_fragments: Keep downloaded fragments on disk after downloading is
finished
_no_ytdl_file: Don't use .ytdl file
For each incomplete fragment download yt-dlp keeps on disk a special
bookkeeping file with download state and metadata (in future such files will
@@ -69,15 +70,17 @@ class FragmentFD(FileDownloader):
self._prepare_frag_download(ctx)
self._start_frag_download(ctx)
@staticmethod
def __do_ytdl_file(ctx):
return not ctx['live'] and not ctx['tmpfilename'] == '-'
def __do_ytdl_file(self, ctx):
return not ctx['live'] and not ctx['tmpfilename'] == '-' and not self.params.get('_no_ytdl_file')
def _read_ytdl_file(self, ctx):
assert 'ytdl_corrupt' not in ctx
stream, _ = sanitize_open(self.ytdl_filename(ctx['filename']), 'r')
try:
ctx['fragment_index'] = json.loads(stream.read())['downloader']['current_fragment']['index']
ytdl_data = json.loads(stream.read())
ctx['fragment_index'] = ytdl_data['downloader']['current_fragment']['index']
if 'extra_state' in ytdl_data['downloader']:
ctx['extra_state'] = ytdl_data['downloader']['extra_state']
except Exception:
ctx['ytdl_corrupt'] = True
finally:
@@ -90,6 +93,8 @@ class FragmentFD(FileDownloader):
'index': ctx['fragment_index'],
},
}
if 'extra_state' in ctx:
downloader['extra_state'] = ctx['extra_state']
if ctx.get('fragment_count') is not None:
downloader['fragment_count'] = ctx['fragment_count']
frag_index_stream.write(json.dumps({'downloader': downloader}))

View File

@@ -1,6 +1,8 @@
from __future__ import unicode_literals
import errno
import re
import io
import binascii
try:
from Crypto.Cipher import AES
@@ -26,7 +28,9 @@ from ..utils import (
parse_m3u8_attributes,
sanitize_open,
update_url_query,
bug_reports_message,
)
from .. import webvtt
class HlsFD(FragmentFD):
@@ -77,6 +81,8 @@ class HlsFD(FragmentFD):
man_url = info_dict['url']
self.to_screen('[%s] Downloading m3u8 manifest' % self.FD_NAME)
is_webvtt = info_dict['ext'] == 'vtt'
urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
man_url = urlh.geturl()
s = urlh.read().decode('utf-8', 'ignore')
@@ -141,6 +147,8 @@ class HlsFD(FragmentFD):
else:
self._prepare_and_start_frag_download(ctx)
extra_state = ctx.setdefault('extra_state', {})
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
test = self.params.get('test', False)
@@ -291,6 +299,7 @@ class HlsFD(FragmentFD):
if count <= fragment_retries:
self.report_retry_fragment(err, frag_index, count, fragment_retries)
if count > fragment_retries:
ctx['dest_stream'].close()
self.report_error('Giving up after %s fragment retries' % fragment_retries)
return False, frag_index
@@ -307,28 +316,105 @@ class HlsFD(FragmentFD):
return frag_content, frag_index
pack_fragment = lambda frag_content, _: frag_content
if is_webvtt:
def pack_fragment(frag_content, frag_index):
output = io.StringIO()
adjust = 0
for block in webvtt.parse_fragment(frag_content):
if isinstance(block, webvtt.CueBlock):
block.start += adjust
block.end += adjust
dedup_window = extra_state.setdefault('webvtt_dedup_window', [])
cue = block.as_json
# skip the cue if an identical one appears
# in the window of potential duplicates
# and prune the window of unviable candidates
i = 0
skip = True
while i < len(dedup_window):
window_cue = dedup_window[i]
if window_cue == cue:
break
if window_cue['end'] >= cue['start']:
i += 1
continue
del dedup_window[i]
else:
skip = False
if skip:
continue
# add the cue to the window
dedup_window.append(cue)
elif isinstance(block, webvtt.Magic):
# take care of MPEG PES timestamp overflow
if block.mpegts is None:
block.mpegts = 0
extra_state.setdefault('webvtt_mpegts_adjust', 0)
block.mpegts += extra_state['webvtt_mpegts_adjust'] << 33
if block.mpegts < extra_state.get('webvtt_mpegts_last', 0):
extra_state['webvtt_mpegts_adjust'] += 1
block.mpegts += 1 << 33
extra_state['webvtt_mpegts_last'] = block.mpegts
if frag_index == 1:
extra_state['webvtt_mpegts'] = block.mpegts or 0
extra_state['webvtt_local'] = block.local or 0
# XXX: block.local = block.mpegts = None ?
else:
if block.mpegts is not None and block.local is not None:
adjust = (
(block.mpegts - extra_state.get('webvtt_mpegts', 0))
- (block.local - extra_state.get('webvtt_local', 0))
)
continue
elif isinstance(block, webvtt.HeaderBlock):
if frag_index != 1:
# XXX: this should probably be silent as well
# or verify that all segments contain the same data
self.report_warning(bug_reports_message(
'Discarding a %s block found in the middle of the stream; '
'if the subtitles display incorrectly,'
% (type(block).__name__)))
continue
block.write_into(output)
return output.getvalue().encode('utf-8')
def append_fragment(frag_content, frag_index):
fatal = frag_index == 1 or not skip_unavailable_fragments
if frag_content:
fragment_filename = '%s-Frag%d' % (ctx['tmpfilename'], frag_index)
try:
file, frag_sanitized = sanitize_open(fragment_filename, 'rb')
ctx['fragment_filename_sanitized'] = frag_sanitized
file.close()
frag_content = pack_fragment(frag_content, frag_index)
self._append_fragment(ctx, frag_content)
return True
except FileNotFoundError:
if skip_unavailable_fragments:
except EnvironmentError as ose:
if ose.errno != errno.ENOENT:
raise
# FileNotFoundError
if not fatal:
self.report_skip_fragment(frag_index)
return True
else:
ctx['dest_stream'].close()
self.report_error(
'fragment %s not found, unable to continue' % frag_index)
return False
else:
if skip_unavailable_fragments:
if not fatal:
self.report_skip_fragment(frag_index)
return True
else:
ctx['dest_stream'].close()
self.report_error(
'fragment %s not found, unable to continue' % frag_index)
return False

View File

@@ -48,7 +48,7 @@ def write_piff_header(stream, params):
language = params.get('language', 'und')
height = params.get('height', 0)
width = params.get('width', 0)
is_audio = width == 0 and height == 0
stream_type = params['stream_type']
creation_time = modification_time = int(time.time())
ftyp_payload = b'isml' # major brand
@@ -77,7 +77,7 @@ def write_piff_header(stream, params):
tkhd_payload += u32.pack(0) * 2 # reserved
tkhd_payload += s16.pack(0) # layer
tkhd_payload += s16.pack(0) # alternate group
tkhd_payload += s88.pack(1 if is_audio else 0) # volume
tkhd_payload += s88.pack(1 if stream_type == 'audio' else 0) # volume
tkhd_payload += u16.pack(0) # reserved
tkhd_payload += unity_matrix
tkhd_payload += u1616.pack(width)
@@ -93,19 +93,34 @@ def write_piff_header(stream, params):
mdia_payload = full_box(b'mdhd', 1, 0, mdhd_payload) # Media Header Box
hdlr_payload = u32.pack(0) # pre defined
hdlr_payload += b'soun' if is_audio else b'vide' # handler type
hdlr_payload += u32.pack(0) * 3 # reserved
hdlr_payload += (b'Sound' if is_audio else b'Video') + b'Handler\0' # name
if stream_type == 'audio': # handler type
hdlr_payload += b'soun'
hdlr_payload += u32.pack(0) * 3 # reserved
hdlr_payload += b'SoundHandler\0' # name
elif stream_type == 'video':
hdlr_payload += b'vide'
hdlr_payload += u32.pack(0) * 3 # reserved
hdlr_payload += b'VideoHandler\0' # name
elif stream_type == 'text':
hdlr_payload += b'subt'
hdlr_payload += u32.pack(0) * 3 # reserved
hdlr_payload += b'SubtitleHandler\0' # name
else:
assert False
mdia_payload += full_box(b'hdlr', 0, 0, hdlr_payload) # Handler Reference Box
if is_audio:
if stream_type == 'audio':
smhd_payload = s88.pack(0) # balance
smhd_payload += u16.pack(0) # reserved
media_header_box = full_box(b'smhd', 0, 0, smhd_payload) # Sound Media Header
else:
elif stream_type == 'video':
vmhd_payload = u16.pack(0) # graphics mode
vmhd_payload += u16.pack(0) * 3 # opcolor
media_header_box = full_box(b'vmhd', 0, 1, vmhd_payload) # Video Media Header
elif stream_type == 'text':
media_header_box = full_box(b'sthd', 0, 0, b'') # Subtitle Media Header
else:
assert False
minf_payload = media_header_box
dref_payload = u32.pack(1) # entry count
@@ -117,7 +132,7 @@ def write_piff_header(stream, params):
sample_entry_payload = u8.pack(0) * 6 # reserved
sample_entry_payload += u16.pack(1) # data reference index
if is_audio:
if stream_type == 'audio':
sample_entry_payload += u32.pack(0) * 2 # reserved
sample_entry_payload += u16.pack(params.get('channels', 2))
sample_entry_payload += u16.pack(params.get('bits_per_sample', 16))
@@ -127,7 +142,7 @@ def write_piff_header(stream, params):
if fourcc == 'AACL':
sample_entry_box = box(b'mp4a', sample_entry_payload)
else:
elif stream_type == 'video':
sample_entry_payload += u16.pack(0) # pre defined
sample_entry_payload += u16.pack(0) # reserved
sample_entry_payload += u32.pack(0) * 3 # pre defined
@@ -155,6 +170,18 @@ def write_piff_header(stream, params):
avcc_payload += pps
sample_entry_payload += box(b'avcC', avcc_payload) # AVC Decoder Configuration Record
sample_entry_box = box(b'avc1', sample_entry_payload) # AVC Simple Entry
else:
assert False
elif stream_type == 'text':
if fourcc == 'TTML':
sample_entry_payload += b'http://www.w3.org/ns/ttml\0' # namespace
sample_entry_payload += b'\0' # schema location
sample_entry_payload += b'\0' # auxilary mime types(??)
sample_entry_box = box(b'stpp', sample_entry_payload)
else:
assert False
else:
assert False
stsd_payload += sample_entry_box
stbl_payload = full_box(b'stsd', 0, 0, stsd_payload) # Sample Description Box
@@ -221,10 +248,13 @@ class IsmFD(FragmentFD):
self._prepare_and_start_frag_download(ctx)
extra_state = ctx.setdefault('extra_state', {
'ism_track_written': False,
})
fragment_retries = self.params.get('fragment_retries', 0)
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
track_written = False
frag_index = 0
for i, segment in enumerate(segments):
frag_index += 1
@@ -236,11 +266,11 @@ class IsmFD(FragmentFD):
success, frag_content = self._download_fragment(ctx, segment['url'], info_dict)
if not success:
return False
if not track_written:
if not extra_state['ism_track_written']:
tfhd_data = extract_box_data(frag_content, [b'moof', b'traf', b'tfhd'])
info_dict['_download_params']['track_id'] = u32.unpack(tfhd_data[4:8])[0]
write_piff_header(ctx['dest_stream'], info_dict['_download_params'])
track_written = True
extra_state['ism_track_written'] = True
self._append_fragment(ctx, frag_content)
break
except compat_urllib_error.HTTPError as err:

View File

@@ -24,16 +24,14 @@ class NiconicoDmcFD(FileDownloader):
success = download_complete = False
timer = [None]
heartbeat_lock = threading.Lock()
heartbeat_url = heartbeat_info_dict['url']
heartbeat_data = heartbeat_info_dict['data']
heartbeat_data = heartbeat_info_dict['data'].encode()
heartbeat_interval = heartbeat_info_dict.get('interval', 30)
self.to_screen('[%s] Heartbeat with %s second interval ...' % (self.FD_NAME, heartbeat_interval))
def heartbeat():
try:
compat_urllib_request.urlopen(url=heartbeat_url, data=heartbeat_data.encode())
compat_urllib_request.urlopen(url=heartbeat_url, data=heartbeat_data)
except Exception:
self.to_screen('[%s] Heartbeat failed' % self.FD_NAME)
@@ -42,13 +40,16 @@ class NiconicoDmcFD(FileDownloader):
timer[0] = threading.Timer(heartbeat_interval, heartbeat)
timer[0].start()
heartbeat_info_dict['ping']()
self.to_screen('[%s] Heartbeat with %d second interval ...' % (self.FD_NAME, heartbeat_interval))
try:
heartbeat()
if type(fd).__name__ == 'HlsFD':
info_dict.update(ie._extract_m3u8_formats(info_dict['url'], info_dict['id'])[0])
success = fd.real_download(filename, info_dict)
finally:
if heartbeat_lock:
with heartbeat_lock:
timer[0].cancel()
download_complete = True
return success

View File

@@ -12,9 +12,6 @@ except ImportError:
if not _LAZY_LOADER:
from .extractors import *
_PLUGIN_CLASSES = load_plugins('extractor', 'IE', globals())
_ALL_CLASSES = [
klass
for name, klass in globals().items()
@@ -22,6 +19,9 @@ if not _LAZY_LOADER:
]
_ALL_CLASSES.append(GenericIE)
_PLUGIN_CLASSES = load_plugins('extractor', 'IE', globals())
_ALL_CLASSES = _PLUGIN_CLASSES + _ALL_CLASSES
def gen_extractor_classes():
""" Return a list of supported extractors.

View File

@@ -86,18 +86,19 @@ class AtresPlayerIE(InfoExtractor):
title = episode['titulo']
formats = []
subtitles = {}
for source in episode.get('sources', []):
src = source.get('src')
if not src:
continue
src_type = source.get('type')
if src_type == 'application/vnd.apple.mpegurl':
formats.extend(self._extract_m3u8_formats(
formats, subtitles = self._extract_m3u8_formats(
src, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
m3u8_id='hls', fatal=False)
elif src_type == 'application/dash+xml':
formats.extend(self._extract_mpd_formats(
src, video_id, mpd_id='dash', fatal=False))
formats, subtitles = self._extract_mpd_formats(
src, video_id, mpd_id='dash', fatal=False)
self._sort_formats(formats)
heartbeat = episode.get('heartbeat') or {}
@@ -115,4 +116,5 @@ class AtresPlayerIE(InfoExtractor):
'channel': get_meta('channel'),
'season': get_meta('season'),
'episode_number': int_or_none(get_meta('episodeNumber')),
'subtitles': subtitles,
}

View File

@@ -78,8 +78,8 @@ class BlinkxIE(InfoExtractor):
'fullid': video_id,
'title': data['title'],
'formats': formats,
'uploader': data['channel_name'],
'timestamp': data['pubdate_epoch'],
'uploader': data.get('channel_name'),
'timestamp': data.get('pubdate_epoch'),
'description': data.get('description'),
'thumbnails': thumbnails,
'duration': duration,

View File

@@ -82,6 +82,7 @@ class BYUtvIE(InfoExtractor):
info = {}
formats = []
subtitles = {}
for format_id, ep in video.items():
if not isinstance(ep, dict):
continue
@@ -90,12 +91,16 @@ class BYUtvIE(InfoExtractor):
continue
ext = determine_ext(video_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
m3u8_fmts, m3u8_subs = self._extract_m3u8_formats_and_subtitles(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
m3u8_id='hls', fatal=False)
formats.extend(m3u8_fmts)
subtitles = self._merge_subtitles(subtitles, m3u8_subs)
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
mpd_fmts, mpd_subs = self._extract_mpd_formats_and_subtitles(
video_url, video_id, mpd_id='dash', fatal=False)
formats.extend(mpd_fmts)
subtitles = self._merge_subtitles(subtitles, mpd_subs)
else:
formats.append({
'url': video_url,
@@ -114,4 +119,5 @@ class BYUtvIE(InfoExtractor):
'display_id': display_id,
'title': display_id,
'formats': formats,
'subtitles': subtitles,
})

View File

@@ -83,24 +83,31 @@ class CanvasIE(InfoExtractor):
description = data.get('description')
formats = []
subtitles = {}
for target in data['targetUrls']:
format_url, format_type = url_or_none(target.get('url')), str_or_none(target.get('type'))
if not format_url or not format_type:
continue
format_type = format_type.upper()
if format_type in self._HLS_ENTRY_PROTOCOLS_MAP:
formats.extend(self._extract_m3u8_formats(
fmts, subs = self._extract_m3u8_formats_and_subtitles(
format_url, video_id, 'mp4', self._HLS_ENTRY_PROTOCOLS_MAP[format_type],
m3u8_id=format_type, fatal=False))
m3u8_id=format_type, fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
elif format_type == 'HDS':
formats.extend(self._extract_f4m_formats(
format_url, video_id, f4m_id=format_type, fatal=False))
elif format_type == 'MPEG_DASH':
formats.extend(self._extract_mpd_formats(
format_url, video_id, mpd_id=format_type, fatal=False))
fmts, subs = self._extract_mpd_formats_and_subtitles(
format_url, video_id, mpd_id=format_type, fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
elif format_type == 'HSS':
formats.extend(self._extract_ism_formats(
format_url, video_id, ism_id='mss', fatal=False))
fmts, subs = self._extract_ism_formats_and_subtitles(
format_url, video_id, ism_id='mss', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
else:
formats.append({
'format_id': format_type,
@@ -108,7 +115,6 @@ class CanvasIE(InfoExtractor):
})
self._sort_formats(formats)
subtitles = {}
subtitle_urls = data.get('subtitleUrls')
if isinstance(subtitle_urls, list):
for subtitle in subtitle_urls:

View File

@@ -27,7 +27,13 @@ class CBSBaseIE(ThePlatformFeedIE):
class CBSIE(CBSBaseIE):
_VALID_URL = r'(?:cbs:|https?://(?:www\.)?(?:(?:cbs|paramountplus)\.com/shows/[^/]+/video|colbertlateshow\.com/(?:video|podcasts))/)(?P<id>[\w-]+)'
_VALID_URL = r'''(?x)
(?:
cbs:|
https?://(?:www\.)?(?:
(?:cbs|paramountplus)\.com/(?:shows/[^/]+/video|movies/[^/]+)/|
colbertlateshow\.com/(?:video|podcasts)/)
)(?P<id>[\w-]+)'''
_TESTS = [{
'url': 'https://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
@@ -55,6 +61,9 @@ class CBSIE(CBSBaseIE):
}, {
'url': 'https://www.paramountplus.com/shows/all-rise/video/QmR1WhNkh1a_IrdHZrbcRklm176X_rVc/all-rise-space/',
'only_matching': True,
}, {
'url': 'https://www.paramountplus.com/movies/million-dollar-american-princesses-meghan-and-harry/C0LpgNwXYeB8txxycdWdR9TjxpJOsdCq',
'only_matching': True,
}]
def _extract_video_info(self, content_id, site='cbs', mpx_acc=2198311517):

View File

@@ -133,6 +133,8 @@ class CDAIE(InfoExtractor):
'age_limit': 18 if need_confirm_age else 0,
}
info = self._search_json_ld(webpage, video_id, default={})
# Source: https://www.cda.pl/js/player.js?t=1606154898
def decrypt_file(a):
for p in ('_XDDD', '_CDA', '_ADC', '_CXD', '_QWE', '_Q5', '_IKSDE'):
@@ -197,7 +199,7 @@ class CDAIE(InfoExtractor):
handler = self._download_webpage
webpage = handler(
self._BASE_URL + href, video_id,
urljoin(self._BASE_URL, href), video_id,
'Downloading %s version information' % resolution, fatal=False)
if not webpage:
# Manually report warning because empty page is returned when
@@ -209,6 +211,4 @@ class CDAIE(InfoExtractor):
self._sort_formats(formats)
info = self._search_json_ld(webpage, video_id, default={})
return merge_dicts(info_dict, info)

View File

@@ -9,8 +9,6 @@ import netrc
import os
import random
import re
import socket
import ssl
import sys
import time
import math
@@ -58,6 +56,7 @@ from ..utils import (
js_to_json,
JSON_LD_RE,
mimetype2ext,
network_exceptions,
orderedSet,
parse_bitrate,
parse_codecs,
@@ -157,7 +156,7 @@ class InfoExtractor(object):
* player_url SWF Player URL (used for rtmpdump).
* protocol The protocol that will be used for the actual
download, lower-case.
"http", "https", "rtsp", "rtmp", "rtmpe",
"http", "https", "rtsp", "rtmp", "rtmp_ffmpeg", "rtmpe",
"m3u8", "m3u8_native" or "http_dash_segments".
* fragment_base_url
Base URL for fragments. Each fragment's path
@@ -558,6 +557,10 @@ class InfoExtractor(object):
ie_result = self._real_extract(url)
if self._x_forwarded_for_ip:
ie_result['__x_forwarded_for_ip'] = self._x_forwarded_for_ip
subtitles = ie_result.get('subtitles')
if (subtitles and 'live_chat' in subtitles
and 'no-live-chat' in self._downloader.params.get('compat_opts')):
del subtitles['live_chat']
return ie_result
except GeoRestrictedError as e:
if self.__maybe_fake_ip_and_retry(e.countries):
@@ -659,12 +662,9 @@ class InfoExtractor(object):
url_or_request = update_url_query(url_or_request, query)
if data is not None or headers:
url_or_request = sanitized_Request(url_or_request, data, headers)
exceptions = [compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error]
if hasattr(ssl, 'CertificateError'):
exceptions.append(ssl.CertificateError)
try:
return self._downloader.urlopen(url_or_request)
except tuple(exceptions) as err:
except network_exceptions as err:
if isinstance(err, compat_urllib_error.HTTPError):
if self.__can_accept_status_code(err, expected_status):
# Retain reference to error to prevent file object from
@@ -1419,7 +1419,10 @@ class InfoExtractor(object):
default = ('hidden', 'hasvid', 'ie_pref', 'lang', 'quality',
'res', 'fps', 'codec:vp9.2', 'size', 'br', 'asr',
'proto', 'ext', 'has_audio', 'source', 'format_id') # These must not be aliases
'proto', 'ext', 'hasaud', 'source', 'format_id') # These must not be aliases
ytdl_default = ('hasaud', 'quality', 'tbr', 'filesize', 'vbr',
'height', 'width', 'proto', 'vext', 'abr', 'aext',
'fps', 'fs_approx', 'source', 'format_id')
settings = {
'vcodec': {'type': 'ordered', 'regex': True,
@@ -1439,7 +1442,7 @@ class InfoExtractor(object):
'hasvid': {'priority': True, 'field': 'vcodec', 'type': 'boolean', 'not_in_list': ('none',)},
'hasaud': {'field': 'acodec', 'type': 'boolean', 'not_in_list': ('none',)},
'lang': {'priority': True, 'convert': 'ignore', 'field': 'language_preference'},
'quality': {'convert': 'float_none'},
'quality': {'convert': 'float_none', 'default': -1},
'filesize': {'convert': 'bytes'},
'fs_approx': {'convert': 'bytes', 'field': 'filesize_approx'},
'id': {'convert': 'string', 'field': 'format_id'},
@@ -1593,7 +1596,7 @@ class InfoExtractor(object):
def print_verbose_info(self, to_screen):
if self._sort_user:
to_screen('[debug] Sort order given by user: %s' % ','.join(self._sort_user))
to_screen('[debug] Sort order given by user: %s' % ', '.join(self._sort_user))
if self._sort_extractor:
to_screen('[debug] Sort order given by extractor: %s' % ', '.join(self._sort_extractor))
to_screen('[debug] Formats sorted by: %s' % ', '.join(['%s%s%s' % (
@@ -1621,7 +1624,7 @@ class InfoExtractor(object):
value = self._resolve_field_value(field, value, True)
# try to convert to number
val_num = float_or_none(value)
val_num = float_or_none(value, default=self._get_field_setting(field, 'default'))
is_num = self._get_field_setting(field, 'convert') != 'string' and val_num is not None
if is_num:
value = val_num
@@ -1879,11 +1882,21 @@ class InfoExtractor(object):
'format_note': 'Quality selection URL',
}
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
entry_protocol='m3u8', preference=None, quality=None,
m3u8_id=None, note=None, errnote=None,
fatal=True, live=False, data=None, headers={},
query={}):
def _extract_m3u8_formats(self, *args, **kwargs):
fmts, subs = self._extract_m3u8_formats_and_subtitles(*args, **kwargs)
if subs:
self.report_warning(bug_reports_message(
"Ignoring subtitle tracks found in the HLS manifest; "
"if any subtitle tracks are missing,"
))
return fmts
def _extract_m3u8_formats_and_subtitles(
self, m3u8_url, video_id, ext=None, entry_protocol='m3u8',
preference=None, quality=None, m3u8_id=None, note=None,
errnote=None, fatal=True, live=False, data=None, headers={},
query={}):
res = self._download_webpage_handle(
m3u8_url, video_id,
note=note or 'Downloading m3u8 information',
@@ -1891,30 +1904,34 @@ class InfoExtractor(object):
fatal=fatal, data=data, headers=headers, query=query)
if res is False:
return []
return [], {}
m3u8_doc, urlh = res
m3u8_url = urlh.geturl()
return self._parse_m3u8_formats(
return self._parse_m3u8_formats_and_subtitles(
m3u8_doc, m3u8_url, ext=ext, entry_protocol=entry_protocol,
preference=preference, quality=quality, m3u8_id=m3u8_id,
note=note, errnote=errnote, fatal=fatal, live=live, data=data,
headers=headers, query=query, video_id=video_id)
def _parse_m3u8_formats(self, m3u8_doc, m3u8_url, ext=None,
entry_protocol='m3u8', preference=None, quality=None,
m3u8_id=None, live=False, note=None, errnote=None,
fatal=True, data=None, headers={}, query={}, video_id=None):
def _parse_m3u8_formats_and_subtitles(
self, m3u8_doc, m3u8_url, ext=None, entry_protocol='m3u8',
preference=None, quality=None, m3u8_id=None, live=False, note=None,
errnote=None, fatal=True, data=None, headers={}, query={},
video_id=None):
if '#EXT-X-FAXS-CM:' in m3u8_doc: # Adobe Flash Access
return []
return [], {}
if (not self._downloader.params.get('allow_unplayable_formats')
and re.search(r'#EXT-X-SESSION-KEY:.*?URI="skd://', m3u8_doc)): # Apple FairPlay
return []
return [], {}
formats = []
subtitles = {}
format_url = lambda u: (
u
if re.match(r'^https?://', u)
@@ -2001,7 +2018,7 @@ class InfoExtractor(object):
}
formats.append(f)
return formats
return formats, subtitles
groups = {}
last_stream_inf = {}
@@ -2013,6 +2030,21 @@ class InfoExtractor(object):
if not (media_type and group_id and name):
return
groups.setdefault(group_id, []).append(media)
# <https://tools.ietf.org/html/rfc8216#section-4.3.4.1>
if media_type == 'SUBTITLES':
lang = media['LANGUAGE'] # XXX: normalise?
url = format_url(media['URI'])
sub_info = {
'url': url,
'ext': determine_ext(url),
}
if sub_info['ext'] == 'm3u8':
# Per RFC 8216 §3.1, the only possible subtitle format m3u8
# files may contain is WebVTT:
# <https://tools.ietf.org/html/rfc8216#section-3.1>
sub_info['ext'] = 'vtt'
sub_info['protocol'] = 'm3u8_native'
subtitles.setdefault(lang, []).append(sub_info)
if media_type not in ('VIDEO', 'AUDIO'):
return
media_url = media.get('URI')
@@ -2160,7 +2192,7 @@ class InfoExtractor(object):
formats.append(http_f)
last_stream_inf = {}
return formats
return formats, subtitles
@staticmethod
def _xpath_ns(path, namespace=None):
@@ -2403,23 +2435,44 @@ class InfoExtractor(object):
})
return entries
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
def _extract_mpd_formats(self, *args, **kwargs):
fmts, subs = self._extract_mpd_formats_and_subtitles(*args, **kwargs)
if subs:
self.report_warning(bug_reports_message(
"Ignoring subtitle tracks found in the DASH manifest; "
"if any subtitle tracks are missing,"
))
return fmts
def _extract_mpd_formats_and_subtitles(
self, mpd_url, video_id, mpd_id=None, note=None, errnote=None,
fatal=True, data=None, headers={}, query={}):
res = self._download_xml_handle(
mpd_url, video_id,
note=note or 'Downloading MPD manifest',
errnote=errnote or 'Failed to download MPD manifest',
fatal=fatal, data=data, headers=headers, query=query)
if res is False:
return []
return [], {}
mpd_doc, urlh = res
if mpd_doc is None:
return []
return [], {}
mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats(
return self._parse_mpd_formats_and_subtitles(
mpd_doc, mpd_id, mpd_base_url, mpd_url)
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', mpd_url=None):
def _parse_mpd_formats(self, *args, **kwargs):
fmts, subs = self._parse_mpd_formats_and_subtitles(*args, **kwargs)
if subs:
self.report_warning(bug_reports_message(
"Ignoring subtitle tracks found in the DASH manifest; "
"if any subtitle tracks are missing,"
))
return fmts
def _parse_mpd_formats_and_subtitles(
self, mpd_doc, mpd_id=None, mpd_base_url='', mpd_url=None):
"""
Parse formats from MPD manifest.
References:
@@ -2429,7 +2482,7 @@ class InfoExtractor(object):
"""
if not self._downloader.params.get('dynamic_mpd', True):
if mpd_doc.get('type') == 'dynamic':
return []
return [], {}
namespace = self._search_regex(r'(?i)^{([^}]+)?}MPD$', mpd_doc.tag, 'namespace', default=None)
@@ -2501,6 +2554,7 @@ class InfoExtractor(object):
mpd_duration = parse_duration(mpd_doc.get('mediaPresentationDuration'))
formats = []
subtitles = {}
for period in mpd_doc.findall(_add_ns('Period')):
period_duration = parse_duration(period.get('duration')) or mpd_duration
period_ms_info = extract_multisegment_info(period, {
@@ -2518,11 +2572,9 @@ class InfoExtractor(object):
representation_attrib.update(representation.attrib)
# According to [1, 5.3.7.2, Table 9, page 41], @mimeType is mandatory
mime_type = representation_attrib['mimeType']
content_type = mime_type.split('/')[0]
if content_type == 'text':
# TODO implement WebVTT downloading
pass
elif content_type in ('video', 'audio'):
content_type = representation_attrib.get('contentType', mime_type.split('/')[0])
if content_type in ('video', 'audio', 'text'):
base_url = ''
for element in (representation, adaptation_set, period, mpd_doc):
base_url_e = element.find(_add_ns('BaseURL'))
@@ -2539,21 +2591,28 @@ class InfoExtractor(object):
url_el = representation.find(_add_ns('BaseURL'))
filesize = int_or_none(url_el.attrib.get('{http://youtube.com/yt/2012/10/10}contentLength') if url_el is not None else None)
bandwidth = int_or_none(representation_attrib.get('bandwidth'))
f = {
'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
'manifest_url': mpd_url,
'ext': mimetype2ext(mime_type),
'width': int_or_none(representation_attrib.get('width')),
'height': int_or_none(representation_attrib.get('height')),
'tbr': float_or_none(bandwidth, 1000),
'asr': int_or_none(representation_attrib.get('audioSamplingRate')),
'fps': int_or_none(representation_attrib.get('frameRate')),
'language': lang if lang not in ('mul', 'und', 'zxx', 'mis') else None,
'format_note': 'DASH %s' % content_type,
'filesize': filesize,
'container': mimetype2ext(mime_type) + '_dash',
}
f.update(parse_codecs(representation_attrib.get('codecs')))
if content_type in ('video', 'audio'):
f = {
'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
'manifest_url': mpd_url,
'ext': mimetype2ext(mime_type),
'width': int_or_none(representation_attrib.get('width')),
'height': int_or_none(representation_attrib.get('height')),
'tbr': float_or_none(bandwidth, 1000),
'asr': int_or_none(representation_attrib.get('audioSamplingRate')),
'fps': int_or_none(representation_attrib.get('frameRate')),
'language': lang if lang not in ('mul', 'und', 'zxx', 'mis') else None,
'format_note': 'DASH %s' % content_type,
'filesize': filesize,
'container': mimetype2ext(mime_type) + '_dash',
}
f.update(parse_codecs(representation_attrib.get('codecs')))
elif content_type == 'text':
f = {
'ext': mimetype2ext(mime_type),
'manifest_url': mpd_url,
'filesize': filesize,
}
representation_ms_info = extract_multisegment_info(representation, adaption_set_ms_info)
def prepare_template(template_name, identifiers):
@@ -2700,26 +2759,38 @@ class InfoExtractor(object):
else:
# Assuming direct URL to unfragmented media.
f['url'] = base_url
formats.append(f)
if content_type in ('video', 'audio'):
formats.append(f)
elif content_type == 'text':
subtitles.setdefault(lang or 'und', []).append(f)
else:
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
return formats
return formats, subtitles
def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
def _extract_ism_formats(self, *args, **kwargs):
fmts, subs = self._extract_ism_formats_and_subtitles(*args, **kwargs)
if subs:
self.report_warning(bug_reports_message(
"Ignoring subtitle tracks found in the ISM manifest; "
"if any subtitle tracks are missing,"
))
return fmts
def _extract_ism_formats_and_subtitles(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
res = self._download_xml_handle(
ism_url, video_id,
note=note or 'Downloading ISM manifest',
errnote=errnote or 'Failed to download ISM manifest',
fatal=fatal, data=data, headers=headers, query=query)
if res is False:
return []
return [], {}
ism_doc, urlh = res
if ism_doc is None:
return []
return [], {}
return self._parse_ism_formats(ism_doc, urlh.geturl(), ism_id)
return self._parse_ism_formats_and_subtitles(ism_doc, urlh.geturl(), ism_id)
def _parse_ism_formats(self, ism_doc, ism_url, ism_id=None):
def _parse_ism_formats_and_subtitles(self, ism_doc, ism_url, ism_id=None):
"""
Parse formats from ISM manifest.
References:
@@ -2727,26 +2798,28 @@ class InfoExtractor(object):
https://msdn.microsoft.com/en-us/library/ff469518.aspx
"""
if ism_doc.get('IsLive') == 'TRUE':
return []
return [], {}
if (not self._downloader.params.get('allow_unplayable_formats')
and ism_doc.find('Protection') is not None):
return []
return [], {}
duration = int(ism_doc.attrib['Duration'])
timescale = int_or_none(ism_doc.get('TimeScale')) or 10000000
formats = []
subtitles = {}
for stream in ism_doc.findall('StreamIndex'):
stream_type = stream.get('Type')
if stream_type not in ('video', 'audio'):
if stream_type not in ('video', 'audio', 'text'):
continue
url_pattern = stream.attrib['Url']
stream_timescale = int_or_none(stream.get('TimeScale')) or timescale
stream_name = stream.get('Name')
stream_language = stream.get('Language', 'und')
for track in stream.findall('QualityLevel'):
fourcc = track.get('FourCC', 'AACL' if track.get('AudioTag') == '255' else None)
# TODO: add support for WVC1 and WMAP
if fourcc not in ('H264', 'AVC1', 'AACL'):
if fourcc not in ('H264', 'AVC1', 'AACL', 'TTML'):
self.report_warning('%s is not a supported codec' % fourcc)
continue
tbr = int(track.attrib['Bitrate']) // 1000
@@ -2789,33 +2862,52 @@ class InfoExtractor(object):
format_id.append(stream_name)
format_id.append(compat_str(tbr))
formats.append({
'format_id': '-'.join(format_id),
'url': ism_url,
'manifest_url': ism_url,
'ext': 'ismv' if stream_type == 'video' else 'isma',
'width': width,
'height': height,
'tbr': tbr,
'asr': sampling_rate,
'vcodec': 'none' if stream_type == 'audio' else fourcc,
'acodec': 'none' if stream_type == 'video' else fourcc,
'protocol': 'ism',
'fragments': fragments,
'_download_params': {
'duration': duration,
'timescale': stream_timescale,
'width': width or 0,
'height': height or 0,
'fourcc': fourcc,
'codec_private_data': track.get('CodecPrivateData'),
'sampling_rate': sampling_rate,
'channels': int_or_none(track.get('Channels', 2)),
'bits_per_sample': int_or_none(track.get('BitsPerSample', 16)),
'nal_unit_length_field': int_or_none(track.get('NALUnitLengthField', 4)),
},
})
return formats
if stream_type == 'text':
subtitles.setdefault(stream_language, []).append({
'ext': 'ismt',
'protocol': 'ism',
'url': ism_url,
'manifest_url': ism_url,
'fragments': fragments,
'_download_params': {
'stream_type': stream_type,
'duration': duration,
'timescale': stream_timescale,
'fourcc': fourcc,
'language': stream_language,
'codec_private_data': track.get('CodecPrivateData'),
}
})
elif stream_type in ('video', 'audio'):
formats.append({
'format_id': '-'.join(format_id),
'url': ism_url,
'manifest_url': ism_url,
'ext': 'ismv' if stream_type == 'video' else 'isma',
'width': width,
'height': height,
'tbr': tbr,
'asr': sampling_rate,
'vcodec': 'none' if stream_type == 'audio' else fourcc,
'acodec': 'none' if stream_type == 'video' else fourcc,
'protocol': 'ism',
'fragments': fragments,
'_download_params': {
'stream_type': stream_type,
'duration': duration,
'timescale': stream_timescale,
'width': width or 0,
'height': height or 0,
'fourcc': fourcc,
'language': stream_language,
'codec_private_data': track.get('CodecPrivateData'),
'sampling_rate': sampling_rate,
'channels': int_or_none(track.get('Channels', 2)),
'bits_per_sample': int_or_none(track.get('BitsPerSample', 16)),
'nal_unit_length_field': int_or_none(track.get('NALUnitLengthField', 4)),
},
})
return formats, subtitles
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None, quality=None):
def absolute_url(item_url):
@@ -2940,7 +3032,16 @@ class InfoExtractor(object):
entries.append(media_info)
return entries
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
def _extract_akamai_formats(self, *args, **kwargs):
fmts, subs = self._extract_akamai_formats_and_subtitles(*args, **kwargs)
if subs:
self.report_warning(bug_reports_message(
"Ignoring subtitle tracks found in the manifests; "
"if any subtitle tracks are missing,"
))
return fmts
def _extract_akamai_formats_and_subtitles(self, manifest_url, video_id, hosts={}):
signed = 'hdnea=' in manifest_url
if not signed:
# https://learn.akamai.com/en-us/webhelp/media-services-on-demand/stream-packaging-user-guide/GUID-BE6C0F73-1E06-483B-B0EA-57984B91B7F9.html
@@ -2949,6 +3050,7 @@ class InfoExtractor(object):
'', manifest_url).strip('?')
formats = []
subtitles = {}
hdcore_sign = 'hdcore=3.7.0'
f4m_url = re.sub(r'(https?://[^/]+)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
@@ -2967,10 +3069,11 @@ class InfoExtractor(object):
hls_host = hosts.get('hls')
if hls_host:
m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
m3u8_formats = self._extract_m3u8_formats(
m3u8_formats, m3u8_subtitles = self._extract_m3u8_formats_and_subtitles(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False)
formats.extend(m3u8_formats)
subtitles = self._merge_subtitles(subtitles, m3u8_subtitles)
http_host = hosts.get('http')
if http_host and m3u8_formats and not signed:
@@ -2994,7 +3097,7 @@ class InfoExtractor(object):
formats.append(http_f)
i += 1
return formats
return formats, subtitles
def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]):
query = compat_urlparse.urlparse(url).query
@@ -3319,12 +3422,22 @@ class InfoExtractor(object):
return ret
@classmethod
def _merge_subtitles(cls, subtitle_dict1, subtitle_dict2):
""" Merge two subtitle dictionaries, language by language. """
ret = dict(subtitle_dict1)
for lang in subtitle_dict2:
ret[lang] = cls._merge_subtitle_items(subtitle_dict1.get(lang, []), subtitle_dict2[lang])
return ret
def _merge_subtitles(cls, *dicts, **kwargs):
""" Merge subtitle dictionaries, language by language. """
target = (lambda target=None: target)(**kwargs)
# The above lambda extracts the keyword argument 'target' from kwargs
# while ensuring there are no stray ones. When Python 2 support
# is dropped, remove it and change the function signature to:
#
# def _merge_subtitles(cls, *dicts, target=None):
if target is None:
target = {}
for d in dicts:
for lang, subs in d.items():
target[lang] = cls._merge_subtitle_items(target.get(lang, []), subs)
return target
def extract_automatic_captions(self, *args, **kwargs):
if (self._downloader.params.get('writeautomaticsub', False)

View File

@@ -12,6 +12,7 @@ from ..utils import (
determine_ext,
float_or_none,
int_or_none,
orderedSet,
parse_age_limit,
parse_duration,
url_or_none,
@@ -66,135 +67,179 @@ class CrackleIE(InfoExtractor):
},
}
def _download_json(self, url, *args, **kwargs):
# Authorization generation algorithm is reverse engineered from:
# https://www.sonycrackle.com/static/js/main.ea93451f.chunk.js
timestamp = time.strftime('%Y%m%d%H%M', time.gmtime())
h = hmac.new(b'IGSLUQCBDFHEOIFM', '|'.join([url, timestamp]).encode(), hashlib.sha1).hexdigest().upper()
headers = {
'Accept': 'application/json',
'Authorization': '|'.join([h, timestamp, '117', '1']),
}
return InfoExtractor._download_json(self, url, *args, headers=headers, **kwargs)
def _real_extract(self, url):
video_id = self._match_id(url)
country_code = self._downloader.params.get('geo_bypass_country', None)
countries = [country_code] if country_code else (
'US', 'AU', 'CA', 'AS', 'FM', 'GU', 'MP', 'PR', 'PW', 'MH', 'VI')
geo_bypass_country = self._downloader.params.get('geo_bypass_country', None)
countries = orderedSet((geo_bypass_country, 'US', 'AU', 'CA', 'AS', 'FM', 'GU', 'MP', 'PR', 'PW', 'MH', 'VI', ''))
num_countries, num = len(countries) - 1, 0
last_e = None
media = {}
for num, country in enumerate(countries):
if num == 1: # start hard-coded list
self.report_warning('%s. Trying with a list of known countries' % (
'Unable to obtain video formats from %s API' % geo_bypass_country if geo_bypass_country
else 'No country code was given using --geo-bypass-country'))
elif num == num_countries: # end of list
geo_info = self._download_json(
'https://web-api-us.crackle.com/Service.svc/geo/country',
video_id, fatal=False, note='Downloading geo-location information from crackle API',
errnote='Unable to fetch geo-location information from crackle') or {}
country = geo_info.get('CountryCode')
if country is None:
continue
self.to_screen('%s identified country as %s' % (self.IE_NAME, country))
if country in countries:
self.to_screen('Downloading from %s API was already attempted. Skipping...' % country)
continue
for country in countries:
if country is None:
continue
try:
# Authorization generation algorithm is reverse engineered from:
# https://www.sonycrackle.com/static/js/main.ea93451f.chunk.js
media_detail_url = 'https://web-api-us.crackle.com/Service.svc/details/media/%s/%s?disableProtocols=true' % (video_id, country)
timestamp = time.strftime('%Y%m%d%H%M', time.gmtime())
h = hmac.new(b'IGSLUQCBDFHEOIFM', '|'.join([media_detail_url, timestamp]).encode(), hashlib.sha1).hexdigest().upper()
media = self._download_json(
media_detail_url, video_id, 'Downloading media JSON as %s' % country,
'Unable to download media JSON', headers={
'Accept': 'application/json',
'Authorization': '|'.join([h, timestamp, '117', '1']),
})
'https://web-api-us.crackle.com/Service.svc/details/media/%s/%s?disableProtocols=true' % (video_id, country),
video_id, note='Downloading media JSON from %s API' % country,
errnote='Unable to download media JSON')
except ExtractorError as e:
# 401 means geo restriction, trying next country
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
last_e = e
continue
raise
media_urls = media.get('MediaURLs')
if not media_urls or not isinstance(media_urls, list):
status = media.get('status')
if status.get('messageCode') != '0':
raise ExtractorError(
'%s said: %s %s - %s' % (
self.IE_NAME, status.get('messageCodeDescription'), status.get('messageCode'), status.get('message')),
expected=True)
# Found video formats
if isinstance(media.get('MediaURLs'), list):
break
ignore_no_formats = self._downloader.params.get('ignore_no_formats_error')
allow_unplayable_formats = self._downloader.params.get('allow_unplayable_formats')
if not media or (not media.get('MediaURLs') and not ignore_no_formats):
raise ExtractorError(
'Unable to access the crackle API. Try passing your country code '
'to --geo-bypass-country. If it still does not work and the '
'video is available in your country')
title = media['Title']
formats, subtitles = [], {}
has_drm = False
for e in media.get('MediaURLs') or []:
if e.get('UseDRM'):
has_drm = True
if not allow_unplayable_formats:
continue
format_url = url_or_none(e.get('Path'))
if not format_url:
continue
title = media['Title']
formats = []
for e in media['MediaURLs']:
if not self._downloader.params.get('allow_unplayable_formats') and e.get('UseDRM') is True:
continue
format_url = url_or_none(e.get('Path'))
if not format_url:
continue
ext = determine_ext(format_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
format_url, video_id, mpd_id='dash', fatal=False))
elif format_url.endswith('.ism/Manifest'):
formats.extend(self._extract_ism_formats(
format_url, video_id, ism_id='mss', fatal=False))
else:
mfs_path = e.get('Type')
mfs_info = self._MEDIA_FILE_SLOTS.get(mfs_path)
if not mfs_info:
continue
formats.append({
'url': format_url,
'format_id': 'http-' + mfs_path.split('.')[0],
'width': mfs_info['width'],
'height': mfs_info['height'],
})
self._sort_formats(formats)
description = media.get('Description')
duration = int_or_none(media.get(
'DurationInSeconds')) or parse_duration(media.get('Duration'))
view_count = int_or_none(media.get('CountViews'))
average_rating = float_or_none(media.get('UserRating'))
age_limit = parse_age_limit(media.get('Rating'))
genre = media.get('Genre')
release_year = int_or_none(media.get('ReleaseYear'))
creator = media.get('Directors')
artist = media.get('Cast')
if media.get('MediaTypeDisplayValue') == 'Full Episode':
series = media.get('ShowName')
episode = title
season_number = int_or_none(media.get('Season'))
episode_number = int_or_none(media.get('Episode'))
ext = determine_ext(format_url)
if ext == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
elif ext == 'mpd':
fmts, subs = self._extract_mpd_formats_and_subtitles(
format_url, video_id, mpd_id='dash', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
elif format_url.endswith('.ism/Manifest'):
fmts, subs = self._extract_ism_formats_and_subtitles(
format_url, video_id, ism_id='mss', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
else:
series = episode = season_number = episode_number = None
mfs_path = e.get('Type')
mfs_info = self._MEDIA_FILE_SLOTS.get(mfs_path)
if not mfs_info:
continue
formats.append({
'url': format_url,
'format_id': 'http-' + mfs_path.split('.')[0],
'width': mfs_info['width'],
'height': mfs_info['height'],
})
if not formats and has_drm and not ignore_no_formats:
raise ExtractorError('The video is DRM protected', expected=True)
self._sort_formats(formats)
subtitles = {}
cc_files = media.get('ClosedCaptionFiles')
if isinstance(cc_files, list):
for cc_file in cc_files:
if not isinstance(cc_file, dict):
continue
cc_url = url_or_none(cc_file.get('Path'))
if not cc_url:
continue
lang = cc_file.get('Locale') or 'en'
subtitles.setdefault(lang, []).append({'url': cc_url})
description = media.get('Description')
duration = int_or_none(media.get(
'DurationInSeconds')) or parse_duration(media.get('Duration'))
view_count = int_or_none(media.get('CountViews'))
average_rating = float_or_none(media.get('UserRating'))
age_limit = parse_age_limit(media.get('Rating'))
genre = media.get('Genre')
release_year = int_or_none(media.get('ReleaseYear'))
creator = media.get('Directors')
artist = media.get('Cast')
thumbnails = []
images = media.get('Images')
if isinstance(images, list):
for image_key, image_url in images.items():
mobj = re.search(r'Img_(\d+)[xX](\d+)', image_key)
if not mobj:
continue
thumbnails.append({
'url': image_url,
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
})
if media.get('MediaTypeDisplayValue') == 'Full Episode':
series = media.get('ShowName')
episode = title
season_number = int_or_none(media.get('Season'))
episode_number = int_or_none(media.get('Episode'))
else:
series = episode = season_number = episode_number = None
return {
'id': video_id,
'title': title,
'description': description,
'duration': duration,
'view_count': view_count,
'average_rating': average_rating,
'age_limit': age_limit,
'genre': genre,
'creator': creator,
'artist': artist,
'release_year': release_year,
'series': series,
'episode': episode,
'season_number': season_number,
'episode_number': episode_number,
'thumbnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
}
cc_files = media.get('ClosedCaptionFiles')
if isinstance(cc_files, list):
for cc_file in cc_files:
if not isinstance(cc_file, dict):
continue
cc_url = url_or_none(cc_file.get('Path'))
if not cc_url:
continue
lang = cc_file.get('Locale') or 'en'
subtitles.setdefault(lang, []).append({'url': cc_url})
raise last_e
thumbnails = []
images = media.get('Images')
if isinstance(images, list):
for image_key, image_url in images.items():
mobj = re.search(r'Img_(\d+)[xX](\d+)', image_key)
if not mobj:
continue
thumbnails.append({
'url': image_url,
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
})
return {
'id': video_id,
'title': title,
'description': description,
'duration': duration,
'view_count': view_count,
'average_rating': average_rating,
'age_limit': age_limit,
'genre': genre,
'creator': creator,
'artist': artist,
'release_year': release_year,
'series': series,
'episode': episode,
'season_number': season_number,
'episode_number': episode_number,
'thumbnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
}

View File

@@ -143,32 +143,45 @@ class CuriosityStreamIE(CuriosityStreamBaseIE):
}
class CuriosityStreamCollectionIE(CuriosityStreamBaseIE):
IE_NAME = 'curiositystream:collection'
_VALID_URL = r'https?://(?:app\.)?curiositystream\.com/(?:collection|series)/(?P<id>\d+)'
class CuriosityStreamCollectionsIE(CuriosityStreamBaseIE):
IE_NAME = 'curiositystream:collections'
_VALID_URL = r'https?://(?:app\.)?curiositystream\.com/collections/(?P<id>\d+)'
_API_BASE_URL = 'https://api.curiositystream.com/v2/collections/'
_TESTS = [{
'url': 'https://app.curiositystream.com/collection/2',
'url': 'https://curiositystream.com/collections/86',
'info_dict': {
'id': '86',
'title': 'Staff Picks',
'description': 'Wondering where to start? Here are a few of our favorite series and films... from our couch to yours.',
},
'playlist_mincount': 7,
}]
def _real_extract(self, url):
collection_id = self._match_id(url)
collection = self._call_api(collection_id, collection_id)
entries = []
for media in collection.get('media', []):
media_id = compat_str(media.get('id'))
media_type, ie = ('series', CuriosityStreamSeriesIE) if media.get('is_collection') else ('video', CuriosityStreamIE)
entries.append(self.url_result(
'https://curiositystream.com/%s/%s' % (media_type, media_id),
ie=ie.ie_key(), video_id=media_id))
return self.playlist_result(
entries, collection_id,
collection.get('title'), collection.get('description'))
class CuriosityStreamSeriesIE(CuriosityStreamCollectionsIE):
IE_NAME = 'curiositystream:series'
_VALID_URL = r'https?://(?:app\.)?curiositystream\.com/series/(?P<id>\d+)'
_API_BASE_URL = 'https://api.curiositystream.com/v2/series/'
_TESTS = [{
'url': 'https://app.curiositystream.com/series/2',
'info_dict': {
'id': '2',
'title': 'Curious Minds: The Internet',
'description': 'How is the internet shaping our lives in the 21st Century?',
},
'playlist_mincount': 16,
}, {
'url': 'https://curiositystream.com/series/2',
'only_matching': True,
}]
def _real_extract(self, url):
collection_id = self._match_id(url)
collection = self._call_api(
'collections/' + collection_id, collection_id)
entries = []
for media in collection.get('media', []):
media_id = compat_str(media.get('id'))
entries.append(self.url_result(
'https://curiositystream.com/video/' + media_id,
CuriosityStreamIE.ie_key(), media_id))
return self.playlist_result(
entries, collection_id,
collection.get('title'), collection.get('description'))

View File

@@ -32,6 +32,18 @@ class DigitallySpeakingIE(InfoExtractor):
# From http://www.gdcvault.com/play/1013700/Advanced-Material
'url': 'http://sevt.dispeak.com/ubm/gdc/eur10/xml/11256_1282118587281VNIT.xml',
'only_matching': True,
}, {
# From https://gdcvault.com/play/1016624, empty speakerVideo
'url': 'https://sevt.dispeak.com/ubm/gdc/online12/xml/201210-822101_1349794556671DDDD.xml',
'info_dict': {
'id': '201210-822101_1349794556671DDDD',
'ext': 'flv',
'title': 'Pre-launch - Preparing to Take the Plunge',
},
}, {
# From http://www.gdcvault.com/play/1014846/Conference-Keynote-Shigeru, empty slideVideo
'url': 'http://events.digitallyspeaking.com/gdc/project25/xml/p25-miyamoto1999_1282467389849HSVB.xml',
'only_matching': True,
}]
def _parse_mp4(self, metadata):
@@ -85,25 +97,19 @@ class DigitallySpeakingIE(InfoExtractor):
'quality': 1,
'format_id': audio.get('code'),
})
slide_video_path = xpath_text(metadata, './slideVideo', fatal=True)
formats.append({
'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url,
'play_path': remove_end(slide_video_path, '.flv'),
'ext': 'flv',
'format_note': 'slide deck video',
'quality': -2,
'format_id': 'slides',
'acodec': 'none',
})
speaker_video_path = xpath_text(metadata, './speakerVideo', fatal=True)
formats.append({
'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url,
'play_path': remove_end(speaker_video_path, '.flv'),
'ext': 'flv',
'format_note': 'speaker video',
'quality': -1,
'format_id': 'speaker',
})
for video_key, format_id, preference in (
('slide', 'slides', -2), ('speaker', 'speaker', -1)):
video_path = xpath_text(metadata, './%sVideo' % video_key)
if not video_path:
continue
formats.append({
'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url,
'play_path': remove_end(video_path, '.flv'),
'ext': 'flv',
'format_note': '%s video' % video_key,
'quality': preference,
'format_id': format_id,
})
return formats
def _real_extract(self, url):

View File

@@ -1,9 +1,7 @@
# coding: utf-8
from __future__ import unicode_literals
import os
import re
import tempfile
from .common import InfoExtractor
from ..utils import (
@@ -12,12 +10,12 @@ from ..utils import (
try_get,
)
from ..compat import compat_str
from ..downloader.hls import HlsFD
class ElonetIE(InfoExtractor):
_VALID_URL = r'https?://elonet\.finna\.fi/Record/kavi\.elonet_elokuva_(?P<id>[0-9]+)'
_TEST = {
_TESTS = [{
# m3u8 with subtitles
'url': 'https://elonet.finna.fi/Record/kavi.elonet_elokuva_107867',
'md5': '8efc954b96c543711707f87de757caea',
'info_dict': {
@@ -27,62 +25,17 @@ class ElonetIE(InfoExtractor):
'description': 'Valkoinen peura (1952) on Erik Blombergin ohjaama ja yhdessä Mirjami Kuosmasen kanssa käsikirjoittama tarunomainen kertomus valkoisen peuran hahmossa lii...',
'thumbnail': 'https://elonet.finna.fi/Cover/Show?id=kavi.elonet_elokuva_107867&index=0&size=large',
},
}
def _download_m3u8_chunked_subtitle(self, chunklist_url):
"""
Download VTT subtitles from pieces in manifest URL.
Return a string containing joined chunks with extra headers removed.
"""
with tempfile.NamedTemporaryFile(delete=True) as outfile:
fname = outfile.name
hlsdl = HlsFD(self._downloader, {})
hlsdl.download(compat_str(fname), {"url": chunklist_url})
with open(fname, 'r') as fin:
# Remove (some) headers
fdata = re.sub(r'X-TIMESTAMP-MAP.*\n+|WEBVTT\n+', '', fin.read())
os.remove(fname)
return "WEBVTT\n\n" + fdata
def _parse_m3u8_subtitles(self, m3u8_doc, m3u8_url):
"""
Parse subtitles from HLS / m3u8 manifest.
"""
subtitles = {}
baseurl = m3u8_url[:m3u8_url.rindex('/') + 1]
for line in m3u8_doc.split('\n'):
if 'EXT-X-MEDIA:TYPE=SUBTITLES' in line:
lang = self._search_regex(
r'LANGUAGE="(.+?)"', line, 'lang', default=False)
uri = self._search_regex(
r'URI="(.+?)"', line, 'uri', default=False)
if lang and uri:
data = self._download_m3u8_chunked_subtitle(baseurl + uri)
subtitles[lang] = [{'ext': 'vtt', 'data': data}]
return subtitles
def _parse_mpd_subtitles(self, mpd_doc):
"""
Parse subtitles from MPD manifest.
"""
ns = '{urn:mpeg:dash:schema:mpd:2011}'
subtitles = {}
for aset in mpd_doc.findall(".//%sAdaptationSet[@mimeType='text/vtt']" % (ns)):
lang = aset.attrib.get('lang', 'unk')
url = aset.find("./%sRepresentation/%sBaseURL" % (ns, ns)).text
subtitles[lang] = [{'ext': 'vtt', 'url': url}]
return subtitles
def _get_subtitles(self, fmt, doc, url):
if fmt == 'm3u8':
subs = self._parse_m3u8_subtitles(doc, url)
elif fmt == 'mpd':
subs = self._parse_mpd_subtitles(doc)
else:
self.report_warning(
"Cannot download subtitles from '%s' streams." % (fmt))
subs = {}
return subs
}, {
# DASH with subtitles
'url': 'https://elonet.finna.fi/Record/kavi.elonet_elokuva_116539',
'info_dict': {
'id': '116539',
'ext': 'mp4',
'title': 'Minulla on tiikeri',
'description': 'Pienellä pojalla, joka asuu kerrostalossa, on kotieläimenä tiikeri. Se on kuitenkin salaisuus. Kerrostalon räpätäti on Kotilaisen täti, joka on aina vali...',
'thumbnail': 'https://elonet.finna.fi/Cover/Show?id=kavi.elonet_elokuva_116539&index=0&size=large&source=Solr',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -101,8 +54,8 @@ class ElonetIE(InfoExtractor):
self._parse_json(json_s, video_id),
lambda x: x[0]["src"], compat_str)
formats = []
subtitles = {}
if re.search(r'\.m3u8\??', src):
fmt = 'm3u8'
res = self._download_webpage_handle(
# elonet servers have certificate problems
src.replace('https:', 'http:'), video_id,
@@ -111,11 +64,10 @@ class ElonetIE(InfoExtractor):
if res:
doc, urlh = res
url = urlh.geturl()
formats = self._parse_m3u8_formats(doc, url)
formats, subtitles = self._parse_m3u8_formats_and_subtitles(doc, url)
for f in formats:
f['ext'] = 'mp4'
elif re.search(r'\.mpd\??', src):
fmt = 'mpd'
res = self._download_xml_handle(
src, video_id,
note='Downloading MPD manifest',
@@ -123,7 +75,7 @@ class ElonetIE(InfoExtractor):
if res:
doc, urlh = res
url = base_url(urlh.geturl())
formats = self._parse_mpd_formats(doc, mpd_base_url=url)
formats, subtitles = self._parse_mpd_formats_and_subtitles(doc, mpd_base_url=url)
else:
raise ExtractorError("Unknown streaming format")
@@ -133,5 +85,5 @@ class ElonetIE(InfoExtractor):
'description': description,
'thumbnail': thumbnail,
'formats': formats,
'subtitles': self.extract_subtitles(fmt, doc, url),
'subtitles': subtitles,
}

View File

@@ -151,7 +151,6 @@ from .bleacherreport import (
BleacherReportIE,
BleacherReportCMSIE,
)
from .blinkx import BlinkxIE
from .bloomberg import BloombergIE
from .bokecc import BokeCCIE
from .bongacams import BongaCamsIE
@@ -288,7 +287,8 @@ from .ctvnews import CTVNewsIE
from .cultureunplugged import CultureUnpluggedIE
from .curiositystream import (
CuriosityStreamIE,
CuriosityStreamCollectionIE,
CuriosityStreamCollectionsIE,
CuriosityStreamSeriesIE,
)
from .cwtv import CWTVIE
from .dailymail import DailyMailIE
@@ -760,7 +760,10 @@ from .mtv import (
)
from .muenchentv import MuenchenTVIE
from .mwave import MwaveIE, MwaveMeetGreetIE
from .mxplayer import MxplayerIE
from .mxplayer import (
MxplayerIE,
MxplayerShowIE,
)
from .mychannels import MyChannelsIE
from .myspace import MySpaceIE, MySpaceAlbumIE
from .myspass import MySpassIE
@@ -1438,6 +1441,7 @@ from .ufctv import (
UFCTVIE,
UFCArabiaIE,
)
from .ukcolumn import UkColumnIE
from .uktvplay import UKTVPlayIE
from .digiteka import DigitekaIE
from .dlive import (
@@ -1597,6 +1601,7 @@ from .weibo import (
)
from .weiqitv import WeiqiTVIE
from .wimtv import WimTVIE
from .whowatch import WhoWatchIE
from .wistia import (
WistiaIE,
WistiaPlaylistIE,

View File

@@ -3,14 +3,11 @@ from __future__ import unicode_literals
import json
import re
import socket
from .common import InfoExtractor
from ..compat import (
compat_etree_fromstring,
compat_http_client,
compat_str,
compat_urllib_error,
compat_urllib_parse_unquote,
compat_urllib_parse_unquote_plus,
)
@@ -23,6 +20,7 @@ from ..utils import (
int_or_none,
js_to_json,
limit_length,
network_exceptions,
parse_count,
qualities,
sanitized_Request,
@@ -370,7 +368,7 @@ class FacebookIE(InfoExtractor):
note='Confirming login')
if re.search(r'id="checkpointSubmitButton"', check_response) is not None:
self.report_warning('Unable to confirm login, you have to login in your browser and authorize the login.')
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
except network_exceptions as err:
self.report_warning('unable to log in: %s' % error_to_compat_str(err))
return

View File

@@ -151,6 +151,7 @@ class FranceTVIE(InfoExtractor):
videos.append(fallback_info['video'])
formats = []
subtitles = {}
for video in videos:
video_url = video.get('url')
if not video_url:
@@ -171,10 +172,12 @@ class FranceTVIE(InfoExtractor):
sign(video_url, format_id) + '&hdcore=3.7.0&plugin=aasp-3.7.0.39.44',
video_id, f4m_id=format_id, fatal=False))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
m3u8_fmts, m3u8_subs = self._extract_m3u8_formats_and_subtitles(
sign(video_url, format_id), video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id=format_id,
fatal=False))
fatal=False)
formats.extend(m3u8_fmts)
subtitles = self._merge_subtitles(subtitles, m3u8_subs)
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
sign(video_url, format_id), video_id, mpd_id=format_id, fatal=False))
@@ -199,13 +202,12 @@ class FranceTVIE(InfoExtractor):
title += ' - %s' % subtitle
title = title.strip()
subtitles = {}
subtitles_list = [{
'url': subformat['url'],
'ext': subformat.get('format'),
} for subformat in info.get('subtitles', []) if subformat.get('url')]
if subtitles_list:
subtitles['fr'] = subtitles_list
subtitles.setdefault('fr', []).extend(
[{
'url': subformat['url'],
'ext': subformat.get('format'),
} for subformat in info.get('subtitles', []) if subformat.get('url')]
)
return {
'id': video_id,
@@ -357,6 +359,22 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
'skip_download': True,
},
'add_ie': [FranceTVIE.ie_key()],
}, {
'note': 'Only an image exists in initial webpage instead of the video',
'url': 'https://www.francetvinfo.fr/sante/maladie/coronavirus/covid-19-en-inde-une-situation-catastrophique-a-new-dehli_4381095.html',
'info_dict': {
'id': '7d204c9e-a2d3-11eb-9e4c-000d3a23d482',
'ext': 'mp4',
'title': 'Covid-19 : une situation catastrophique à New Dehli',
'thumbnail': str,
'duration': 76,
'timestamp': 1619028518,
'upload_date': '20210421',
},
'params': {
'skip_download': True,
},
'add_ie': [FranceTVIE.ie_key()],
}, {
'url': 'http://www.francetvinfo.fr/elections/europeennes/direct-europeennes-regardez-le-debat-entre-les-candidats-a-la-presidence-de-la-commission_600639.html',
'only_matching': True,
@@ -384,6 +402,10 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
}, {
'url': 'http://france3-regions.francetvinfo.fr/limousin/emissions/jt-1213-limousin',
'only_matching': True,
}, {
# "<figure id=" pattern (#28792)
'url': 'https://www.francetvinfo.fr/culture/patrimoine/incendie-de-notre-dame-de-paris/notre-dame-de-paris-de-l-incendie-de-la-cathedrale-a-sa-reconstruction_4372291.html',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -401,7 +423,7 @@ class FranceTVInfoIE(FranceTVBaseInfoExtractor):
(r'player\.load[^;]+src:\s*["\']([^"\']+)',
r'id-video=([^@]+@[^"]+)',
r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"',
r'data-id=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'),
r'(?:data-id|<figure[^<]+\bid)=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'),
webpage, 'video id')
return self._make_url_result(video_id)

View File

@@ -16,7 +16,7 @@ from ..utils import (
class FunimationIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?funimation(?:\.com|now\.uk)/shows/[^/]+/(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?funimation(?:\.com|now\.uk)/(?:[^/]+/)?shows/[^/]+/(?P<id>[^/?#&]+)'
_NETRC_MACHINE = 'funimation'
_TOKEN = None
@@ -51,6 +51,10 @@ class FunimationIE(InfoExtractor):
}, {
'url': 'https://www.funimationnow.uk/shows/puzzle-dragons-x/drop-impact/simulcast/',
'only_matching': True,
}, {
# with lang code
'url': 'https://www.funimation.com/en/shows/hacksign/role-play/',
'only_matching': True,
}]
def _login(self):

View File

@@ -5,7 +5,10 @@ import re
from .common import InfoExtractor
from .kaltura import KalturaIE
from ..utils import (
HEADRequest,
remove_start,
sanitized_Request,
smuggle_url,
urlencode_postdata,
)
@@ -100,6 +103,26 @@ class GDCVaultIE(InfoExtractor):
'format': 'mp4-408',
},
},
{
# Kaltura embed, whitespace between quote and embedded URL in iframe's src
'url': 'https://www.gdcvault.com/play/1025699',
'info_dict': {
'id': '0_zagynv0a',
'ext': 'mp4',
'title': 'Tech Toolbox',
'upload_date': '20190408',
'uploader_id': 'joe@blazestreaming.com',
'timestamp': 1554764629,
},
'params': {
'skip_download': True,
},
},
{
# HTML5 video
'url': 'http://www.gdcvault.com/play/1014846/Conference-Keynote-Shigeru',
'only_matching': True,
},
]
def _login(self, webpage_url, display_id):
@@ -120,38 +143,78 @@ class GDCVaultIE(InfoExtractor):
request = sanitized_Request(login_url, urlencode_postdata(login_form))
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
self._download_webpage(request, display_id, 'Logging in')
webpage = self._download_webpage(webpage_url, display_id, 'Getting authenticated video page')
start_page = self._download_webpage(webpage_url, display_id, 'Getting authenticated video page')
self._download_webpage(logout_url, display_id, 'Logging out')
return webpage
return start_page
def _real_extract(self, url):
video_id, name = re.match(self._VALID_URL, url).groups()
display_id = name or video_id
webpage = self._download_webpage(url, display_id)
webpage_url = 'http://www.gdcvault.com/play/' + video_id
start_page = self._download_webpage(webpage_url, display_id)
title = self._html_search_regex(
r'<td><strong>Session Name:?</strong></td>\s*<td>(.*?)</td>',
webpage, 'title')
direct_url = self._search_regex(
r's1\.addVariable\("file",\s*encodeURIComponent\("(/[^"]+)"\)\);',
start_page, 'url', default=None)
if direct_url:
title = self._html_search_regex(
r'<td><strong>Session Name:?</strong></td>\s*<td>(.*?)</td>',
start_page, 'title')
video_url = 'http://www.gdcvault.com' + direct_url
# resolve the url so that we can detect the correct extension
video_url = self._request_webpage(
HEADRequest(video_url), video_id).geturl()
PLAYER_REGEX = r'<iframe src=\"(?P<manifest_url>.*?)\".*?</iframe>'
manifest_url = self._html_search_regex(
PLAYER_REGEX, webpage, 'manifest_url')
return {
'id': video_id,
'display_id': display_id,
'url': video_url,
'title': title,
}
partner_id = self._search_regex(
r'/p(?:artner_id)?/(\d+)', manifest_url, 'partner id',
default='1670711')
embed_url = KalturaIE._extract_url(start_page)
if embed_url:
embed_url = smuggle_url(embed_url, {'source_url': url})
ie_key = 'Kaltura'
else:
PLAYER_REGEX = r'<iframe src="(?P<xml_root>.+?)/(?:gdc-)?player.*?\.html.*?".*?</iframe>'
kaltura_id = self._search_regex(
r'entry_id=(?P<id>(?:[^&])+)', manifest_url,
'kaltura id', group='id')
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root', default=None)
if xml_root is None:
# Probably need to authenticate
login_res = self._login(webpage_url, display_id)
if login_res is None:
self.report_warning('Could not login.')
else:
start_page = login_res
# Grab the url from the authenticated page
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root')
xml_name = self._html_search_regex(
r'<iframe src=".*?\?xml(?:=|URL=xml/)(.+?\.xml).*?".*?</iframe>',
start_page, 'xml filename', default=None)
if not xml_name:
info = self._parse_html5_media_entries(url, start_page, video_id)[0]
info.update({
'title': remove_start(self._search_regex(
r'>Session Name:\s*<.*?>\s*<td>(.+?)</td>', start_page,
'title', default=None) or self._og_search_title(
start_page, default=None), 'GDC Vault - '),
'id': video_id,
'display_id': display_id,
})
return info
embed_url = '%s/xml/%s' % (xml_root, xml_name)
ie_key = 'DigitallySpeaking'
return {
'_type': 'url_transparent',
'url': 'kaltura:%s:%s' % (partner_id, kaltura_id),
'ie_key': KalturaIE.ie_key(),
'id': video_id,
'display_id': display_id,
'title': title,
'url': embed_url,
'ie_key': ie_key,
}

View File

@@ -2444,8 +2444,9 @@ class GenericIE(InfoExtractor):
m = re.match(r'^(?P<type>audio|video|application(?=/(?:ogg$|(?:vnd\.apple\.|x-)?mpegurl)))/(?P<format_id>[^;\s]+)', content_type)
if m:
format_id = compat_str(m.group('format_id'))
subtitles = {}
if format_id.endswith('mpegurl'):
formats = self._extract_m3u8_formats(url, video_id, 'mp4')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(url, video_id, 'mp4')
elif format_id == 'f4m':
formats = self._extract_f4m_formats(url, video_id)
else:
@@ -2457,6 +2458,7 @@ class GenericIE(InfoExtractor):
info_dict['direct'] = True
self._sort_formats(formats)
info_dict['formats'] = formats
info_dict['subtitles'] = subtitles
return info_dict
if not self._downloader.params.get('test', False) and not is_intentional:
@@ -2506,11 +2508,14 @@ class GenericIE(InfoExtractor):
# Is it an RSS feed, a SMIL file, an XSPF playlist or a MPD manifest?
try:
doc = compat_etree_fromstring(webpage.encode('utf-8'))
try:
doc = compat_etree_fromstring(webpage)
except compat_xml_parse_error:
doc = compat_etree_fromstring(webpage.encode('utf-8'))
if doc.tag == 'rss':
return self._extract_rss(url, video_id, doc)
elif doc.tag == 'SmoothStreamingMedia':
info_dict['formats'] = self._parse_ism_formats(doc, url)
info_dict['formats'], info_dict['subtitles'] = self._parse_ism_formats_and_subtitles(doc, url)
self._sort_formats(info_dict['formats'])
return info_dict
elif re.match(r'^(?:{[^}]+})?smil$', doc.tag):
@@ -2524,7 +2529,7 @@ class GenericIE(InfoExtractor):
xspf_base_url=full_response.geturl()),
video_id)
elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
info_dict['formats'] = self._parse_mpd_formats(
info_dict['formats'], info_dict['subtitles'] = self._parse_mpd_formats_and_subtitles(
doc,
mpd_base_url=full_response.geturl().rpartition('/')[0],
mpd_url=url)

View File

@@ -120,7 +120,7 @@ class KalturaIE(InfoExtractor):
def _extract_urls(webpage):
# Embed codes: https://knowledge.kaltura.com/embedding-kaltura-media-players-your-site
finditer = (
re.finditer(
list(re.finditer(
r"""(?xs)
kWidget\.(?:thumb)?[Ee]mbed\(
\{.*?
@@ -128,8 +128,8 @@ class KalturaIE(InfoExtractor):
(?P<q2>['"])_?(?P<partner_id>(?:(?!(?P=q2)).)+)(?P=q2),.*?
(?P<q3>['"])entry_?[Ii]d(?P=q3)\s*:\s*
(?P<q4>['"])(?P<id>(?:(?!(?P=q4)).)+)(?P=q4)(?:,|\s*\})
""", webpage)
or re.finditer(
""", webpage))
or list(re.finditer(
r'''(?xs)
(?P<q1>["'])
(?:https?:)?//cdnapi(?:sec)?\.kaltura\.com(?::\d+)?/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+)(?:(?!(?P=q1)).)*
@@ -142,16 +142,16 @@ class KalturaIE(InfoExtractor):
\[\s*(?P<q2_1>["'])entry_?[Ii]d(?P=q2_1)\s*\]\s*=\s*
)
(?P<q3>["'])(?P<id>(?:(?!(?P=q3)).)+)(?P=q3)
''', webpage)
or re.finditer(
''', webpage))
or list(re.finditer(
r'''(?xs)
<(?:iframe[^>]+src|meta[^>]+\bcontent)=(?P<q1>["'])
<(?:iframe[^>]+src|meta[^>]+\bcontent)=(?P<q1>["'])\s*
(?:https?:)?//(?:(?:www|cdnapi(?:sec)?)\.)?kaltura\.com/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+)
(?:(?!(?P=q1)).)*
[?&;]entry_id=(?P<id>(?:(?!(?P=q1))[^&])+)
(?:(?!(?P=q1)).)*
(?P=q1)
''', webpage)
''', webpage))
)
urls = []
for mobj in finditer:

View File

@@ -160,7 +160,10 @@ class LimelightBaseIE(InfoExtractor):
for mobile_url in mobile_item.get('mobileUrls', []):
media_url = mobile_url.get('mobileUrl')
format_id = mobile_url.get('targetMediaPlatform')
if not media_url or format_id in ('Widevine', 'SmoothStreaming') or media_url in urls:
if not media_url or media_url in urls:
continue
if (format_id in ('Widevine', 'SmoothStreaming')
and not self._downloader.params.get('allow_unplayable_formats', False)):
continue
urls.append(media_url)
ext = determine_ext(media_url)

View File

@@ -15,33 +15,39 @@ from ..utils import (
class MedalTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://medal.tv/clips/34934644/3Is9zyGMoBMr',
'url': 'https://medal.tv/clips/2mA60jWAGQCBH',
'md5': '7b07b064331b1cf9e8e5c52a06ae68fa',
'info_dict': {
'id': '34934644',
'id': '2mA60jWAGQCBH',
'ext': 'mp4',
'title': 'Quad Cold',
'description': 'Medal,https://medal.tv/desktop/',
'uploader': 'MowgliSB',
'timestamp': 1603165266,
'upload_date': '20201020',
'uploader_id': 10619174,
'uploader_id': '10619174',
}
}, {
'url': 'https://medal.tv/clips/36787208',
'url': 'https://medal.tv/clips/2um24TWdty0NA',
'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148',
'info_dict': {
'id': '36787208',
'id': '2um24TWdty0NA',
'ext': 'mp4',
'title': 'u tk me i tk u bigger',
'description': 'Medal,https://medal.tv/desktop/',
'uploader': 'Mimicc',
'timestamp': 1605580939,
'upload_date': '20201117',
'uploader_id': 5156321,
'uploader_id': '5156321',
}
}, {
'url': 'https://medal.tv/clips/37rMeFpryCC-9',
'only_matching': True,
}, {
'url': 'https://medal.tv/clips/2WRj40tpY_EU9',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -26,7 +26,7 @@ _ID_RE = r'(?:[0-9a-f]{32,34}|[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0
class MediasiteIE(InfoExtractor):
_VALID_URL = r'(?xi)https?://[^/]+/Mediasite/(?:Play|Showcase/(?:default|livebroadcast)/Presentation)/(?P<id>%s)(?P<query>\?[^#]+|)' % _ID_RE
_VALID_URL = r'(?xi)https?://[^/]+/Mediasite/(?:Play|Showcase/[^/#?]+/Presentation)/(?P<id>%s)(?P<query>\?[^#]+|)' % _ID_RE
_TESTS = [
{
'url': 'https://hitsmediaweb.h-its.org/mediasite/Play/2db6c271681e4f199af3c60d1f82869b1d',

View File

@@ -3,6 +3,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
js_to_json,
@@ -14,7 +15,7 @@ from ..utils import (
class MxplayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?mxplayer\.in/(?:show|movie)/(?:(?P<display_id>[-/a-z0-9]+)-)?(?P<id>[a-z0-9]+)'
_VALID_URL = r'https?://(?:www\.)?mxplayer\.in/(?:movie|show/[-\w]+/[-\w]+)/(?P<display_id>[-\w]+)-(?P<id>\w+)'
_TESTS = [{
'url': 'https://www.mxplayer.in/movie/watch-knock-knock-hindi-dubbed-movie-online-b9fa28df3bfb8758874735bbd7d2655a?watch=true',
'info_dict': {
@@ -117,7 +118,7 @@ class MxplayerIE(InfoExtractor):
self._sort_formats(formats)
return {
'id': video_id,
'display_id': display_id.replace('/', '-'),
'display_id': display_id,
'title': video_dict['title'] or self._og_search_title(webpage),
'formats': formats,
'description': video_dict.get('description'),
@@ -125,3 +126,46 @@ class MxplayerIE(InfoExtractor):
'series': try_get(video_dict, lambda x: x['container']['container']['title']),
'thumbnails': thumbnails,
}
class MxplayerShowIE(InfoExtractor):
_VALID_URL = r'(?:https?://)(?:www\.)?mxplayer\.in/show/(?P<display_id>[-\w]+)-(?P<id>\w+)/?(?:$|[#?])'
_TESTS = [{
'url': 'https://www.mxplayer.in/show/watch-chakravartin-ashoka-samrat-series-online-a8f44e3cc0814b5601d17772cedf5417',
'playlist_mincount': 440,
'info_dict': {
'id': 'a8f44e3cc0814b5601d17772cedf5417',
'title': 'Watch Chakravartin Ashoka Samrat Series Online',
}
}]
_API_SHOW_URL = "https://api.mxplay.com/v1/web/detail/tab/tvshowseasons?type=tv_show&id={}&device-density=2&platform=com.mxplay.desktop&content-languages=hi,en"
_API_EPISODES_URL = "https://api.mxplay.com/v1/web/detail/tab/tvshowepisodes?type=season&id={}&device-density=1&platform=com.mxplay.desktop&content-languages=hi,en&{}"
def _entries(self, show_id):
show_json = self._download_json(
self._API_SHOW_URL.format(show_id),
video_id=show_id, headers={'Referer': 'https://mxplayer.in'})
page_num = 0
for season in show_json.get('items') or []:
season_id = try_get(season, lambda x: x['id'], compat_str)
next_url = ''
while next_url is not None:
page_num += 1
season_json = self._download_json(
self._API_EPISODES_URL.format(season_id, next_url),
video_id=season_id,
headers={'Referer': 'https://mxplayer.in'},
note='Downloading JSON metadata page %d' % page_num)
for episode in season_json.get('items') or []:
video_url = episode['webUrl']
yield self.url_result(
'https://mxplayer.in%s' % video_url,
ie=MxplayerIE.ie_key(), video_id=video_url.split('-')[-1])
next_url = season_json.get('next')
def _real_extract(self, url):
display_id, show_id = re.match(self._VALID_URL, url).groups()
return self.playlist_result(
self._entries(show_id), playlist_id=show_id,
playlist_title=display_id.replace('-', ' ').title())

View File

@@ -15,10 +15,10 @@ from ..utils import (
class NebulaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?watchnebula\.com/videos/(?P<id>[-\w]+)'
_VALID_URL = r'https?://(?:www\.)?(?:watchnebula\.com|nebula\.app)/videos/(?P<id>[-\w]+)'
_TESTS = [
{
'url': 'https://watchnebula.com/videos/that-time-disney-remade-beauty-and-the-beast',
'url': 'https://nebula.app/videos/that-time-disney-remade-beauty-and-the-beast',
'md5': 'fe79c4df8b3aa2fea98a93d027465c7e',
'info_dict': {
'id': '5c271b40b13fd613090034fd',
@@ -36,7 +36,7 @@ class NebulaIE(InfoExtractor):
'skip': 'All Nebula content requires authentication',
},
{
'url': 'https://watchnebula.com/videos/the-logistics-of-d-day-landing-craft-how-the-allies-got-ashore',
'url': 'https://nebula.app/videos/the-logistics-of-d-day-landing-craft-how-the-allies-got-ashore',
'md5': '6d4edd14ce65720fa63aba5c583fb328',
'info_dict': {
'id': '5e7e78171aaf320001fbd6be',
@@ -54,7 +54,7 @@ class NebulaIE(InfoExtractor):
'skip': 'All Nebula content requires authentication',
},
{
'url': 'https://watchnebula.com/videos/money-episode-1-the-draw',
'url': 'https://nebula.app/videos/money-episode-1-the-draw',
'md5': '8c7d272910eea320f6f8e6d3084eecf5',
'info_dict': {
'id': '5e779ebdd157bc0001d1c75a',
@@ -71,6 +71,10 @@ class NebulaIE(InfoExtractor):
},
'skip': 'All Nebula content requires authentication',
},
{
'url': 'https://watchnebula.com/videos/money-episode-1-the-draw',
'only_matching': True,
},
]
_NETRC_MACHINE = 'watchnebula'

View File

@@ -164,6 +164,11 @@ class NiconicoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.|secure\.|sp\.)?nicovideo\.jp/watch/(?P<id>(?:[a-z]{2})?[0-9]+)'
_NETRC_MACHINE = 'niconico'
_API_HEADERS = {
'X-Frontend-ID': '6',
'X-Frontend-Version': '0'
}
def _real_initialize(self):
self._login()
@@ -197,46 +202,48 @@ class NiconicoIE(InfoExtractor):
video_id, video_src_id, audio_src_id = info_dict['url'].split(':')[1].split('/')
# Get video webpage for API data.
webpage, handle = self._download_webpage_handle(
'http://www.nicovideo.jp/watch/' + video_id, video_id)
api_data = self._parse_json(self._html_search_regex(
'data-api-data="([^"]+)"', webpage,
'API data', default='{}'), video_id)
api_data = (
info_dict.get('_api_data')
or self._parse_json(
self._html_search_regex(
'data-api-data="([^"]+)"',
self._download_webpage('http://www.nicovideo.jp/watch/' + video_id, video_id),
'API data', default='{}'),
video_id))
session_api_data = try_get(api_data, lambda x: x['media']['delivery']['movie']['session'])
session_api_endpoint = try_get(session_api_data, lambda x: x['urls'][0])
# ping
self._download_json(
'https://nvapi.nicovideo.jp/v1/2ab0cbaa/watch', video_id,
query={'t': try_get(api_data, lambda x: x['video']['dmcInfo']['tracking_id'])},
headers={
'Origin': 'https://www.nicovideo.jp',
'Referer': 'https://www.nicovideo.jp/watch/' + video_id,
'X-Frontend-Id': '6',
'X-Frontend-Version': '0'
})
def ping():
status = try_get(
self._download_json(
'https://nvapi.nicovideo.jp/v1/2ab0cbaa/watch', video_id,
query={'t': try_get(api_data, lambda x: x['media']['delivery']['trackingId'])},
note='Acquiring permission for downloading video',
headers=self._API_HEADERS),
lambda x: x['meta']['status'])
if status != 200:
self.report_warning('Failed to acquire permission for playing video. The video may not download.')
yesno = lambda x: 'yes' if x else 'no'
# m3u8 (encryption)
if 'encryption' in (try_get(api_data, lambda x: x['media']['delivery']['movie']) or {}):
if try_get(api_data, lambda x: x['media']['delivery']['encryption']) is not None:
protocol = 'm3u8'
encryption = self._parse_json(session_api_data['token'], video_id)['hls_encryption']
session_api_http_parameters = {
'parameters': {
'hls_parameters': {
'encryption': {
'hls_encryption_v1': {
'encrypted_key': try_get(api_data, lambda x: x['video']['dmcInfo']['encryption']['hls_encryption_v1']['encrypted_key']),
'key_uri': try_get(api_data, lambda x: x['video']['dmcInfo']['encryption']['hls_encryption_v1']['key_uri'])
encryption: {
'encrypted_key': try_get(api_data, lambda x: x['media']['delivery']['encryption']['encryptedKey']),
'key_uri': try_get(api_data, lambda x: x['media']['delivery']['encryption']['keyUri'])
}
},
'transfer_preset': '',
'use_ssl': yesno(session_api_endpoint['is_ssl']),
'use_well_known_port': yesno(session_api_endpoint['is_well_known_port']),
'segment_duration': 6000
'use_ssl': yesno(session_api_endpoint['isSsl']),
'use_well_known_port': yesno(session_api_endpoint['isWellKnownPort']),
'segment_duration': 6000,
}
}
}
@@ -310,7 +317,8 @@ class NiconicoIE(InfoExtractor):
'url': session_api_endpoint['url'] + '/' + session_response['data']['session']['id'] + '?_format=json&_method=PUT',
'data': json.dumps(session_response['data']),
# interval, convert milliseconds to seconds, then halve to make a buffer.
'interval': float_or_none(session_api_data.get('heartbeatLifetime'), scale=2000),
'interval': float_or_none(session_api_data.get('heartbeatLifetime'), scale=3000),
'ping': ping
}
return info_dict, heartbeat_info_dict
@@ -400,7 +408,7 @@ class NiconicoIE(InfoExtractor):
# Get HTML5 videos info
quality_info = try_get(api_data, lambda x: x['media']['delivery']['movie'])
if not quality_info:
raise ExtractorError('The video can\'t downloaded.', expected=True)
raise ExtractorError('The video can\'t be downloaded', expected=True)
for audio_quality in quality_info.get('audios') or {}:
for video_quality in quality_info.get('videos') or {}:
@@ -412,9 +420,7 @@ class NiconicoIE(InfoExtractor):
# Get flv/swf info
timestamp = None
video_real_url = try_get(api_data, lambda x: x['video']['smileInfo']['url'])
if not video_real_url:
self.report_warning('Unable to obtain smile video information')
else:
if video_real_url:
is_economy = video_real_url.endswith('low')
if is_economy:
@@ -486,14 +492,12 @@ class NiconicoIE(InfoExtractor):
'filesize': filesize
})
if len(formats) == 0:
raise ExtractorError('Unable to find video info.')
self._sort_formats(formats)
# Start extracting information
title = (
get_video_info_web(['originalTitle', 'title'])
get_video_info_xml('title') # prefer to get the untranslated original title
or get_video_info_web(['originalTitle', 'title'])
or self._og_search_title(webpage, default=None)
or self._html_search_regex(
r'<span[^>]+class="videoHeaderTitle"[^>]*>([^<]+)</span>',
@@ -507,7 +511,9 @@ class NiconicoIE(InfoExtractor):
thumbnail = (
self._html_search_regex(r'<meta property="og:image" content="([^"]+)">', webpage, 'thumbnail data', default=None)
or get_video_info_web(['thumbnail_url', 'largeThumbnailURL', 'thumbnailURL'])
or dict_get( # choose highest from 720p to 240p
get_video_info_web('thumbnail'),
['ogp', 'player', 'largeUrl', 'middleUrl', 'url'])
or self._html_search_meta('image', webpage, 'thumbnail', default=None)
or video_detail.get('thumbnail'))
@@ -582,6 +588,7 @@ class NiconicoIE(InfoExtractor):
return {
'id': video_id,
'_api_data': api_data,
'title': title,
'formats': formats,
'thumbnail': thumbnail,
@@ -616,24 +623,19 @@ class NiconicoPlaylistIE(InfoExtractor):
'only_matching': True,
}]
_API_HEADERS = {
'X-Frontend-ID': '6',
'X-Frontend-Version': '0'
}
def _real_extract(self, url):
list_id = self._match_id(url)
webpage = self._download_webpage(url, list_id)
header = self._parse_json(self._html_search_regex(
r'data-common-header="([^"]+)"', webpage,
'webpage header'), list_id)
frontendId = header.get('initConfig').get('frontendId')
frontendVersion = header.get('initConfig').get('frontendVersion')
def get_page_data(pagenum, pagesize):
return self._download_json(
'http://nvapi.nicovideo.jp/v2/mylists/' + list_id, list_id,
query={'page': 1 + pagenum, 'pageSize': pagesize},
headers={
'X-Frontend-Id': frontendId,
'X-Frontend-Version': frontendVersion,
}).get('data').get('mylist')
headers=self._API_HEADERS).get('data').get('mylist')
data = get_page_data(0, 1)
title = data.get('name')
@@ -669,20 +671,20 @@ class NiconicoUserIE(InfoExtractor):
'playlist_mincount': 101,
}
_API_URL = "https://nvapi.nicovideo.jp/v1/users/%s/videos?sortKey=registeredAt&sortOrder=desc&pageSize=%s&page=%s"
_api_headers = {
'X-Frontend-ID': '6',
'X-Frontend-Version': '0',
'X-Niconico-Language': 'en-us'
}
_PAGE_SIZE = 100
_API_HEADERS = {
'X-Frontend-ID': '6',
'X-Frontend-Version': '0'
}
def _entries(self, list_id, ):
total_count = 1
count = page_num = 0
while count < total_count:
json_parsed = self._download_json(
self._API_URL % (list_id, self._PAGE_SIZE, page_num + 1), list_id,
headers=self._api_headers,
headers=self._API_HEADERS,
note='Downloading JSON metadata%s' % (' page %d' % page_num if page_num else ''))
if not page_num:
total_count = int_or_none(json_parsed['data'].get('totalCount'))

View File

@@ -46,6 +46,7 @@ class NYTimesBaseIE(InfoExtractor):
urls = []
formats = []
subtitles = {}
for video in video_data.get('renditions', []):
video_url = video.get('url')
format_id = video.get('type')
@@ -54,9 +55,11 @@ class NYTimesBaseIE(InfoExtractor):
urls.append(video_url)
ext = mimetype2ext(video.get('mimetype')) or determine_ext(video_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
m3u8_fmts, m3u8_subs = self._extract_m3u8_formats_and_subtitles(
video_url, video_id, 'mp4', 'm3u8_native',
m3u8_id=format_id or 'hls', fatal=False))
m3u8_id=format_id or 'hls', fatal=False)
formats.extend(m3u8_fmts)
subtitles = self._merge_subtitles(subtitles, m3u8_subs)
elif ext == 'mpd':
continue
# formats.extend(self._extract_mpd_formats(
@@ -96,6 +99,7 @@ class NYTimesBaseIE(InfoExtractor):
'uploader': video_data.get('byline'),
'duration': float_or_none(video_data.get('duration'), 1000),
'formats': formats,
'subtitles': subtitles,
'thumbnails': thumbnails,
}

View File

@@ -78,45 +78,60 @@ class PlutoTVIE(InfoExtractor):
},
]
def _to_ad_free_formats(self, video_id, formats):
ad_free_formats = []
m3u8_urls = set()
for format in formats:
def _to_ad_free_formats(self, video_id, formats, subtitles):
ad_free_formats, ad_free_subtitles, m3u8_urls = [], {}, set()
for fmt in formats:
res = self._download_webpage(
format.get('url'), video_id, note='Downloading m3u8 playlist',
fmt.get('url'), video_id, note='Downloading m3u8 playlist',
fatal=False)
if not res:
continue
first_segment_url = re.search(
r'^(https?://.*/)0\-(end|[0-9]+)/[^/]+\.ts$', res,
re.MULTILINE)
if not first_segment_url:
if first_segment_url:
m3u8_urls.add(
compat_urlparse.urljoin(first_segment_url.group(1), '0-end/master.m3u8'))
continue
first_segment_url = re.search(
r'^(https?://.*/).+\-0+\.ts$', res,
re.MULTILINE)
if first_segment_url:
m3u8_urls.add(
compat_urlparse.urljoin(first_segment_url.group(1), 'master.m3u8'))
continue
m3u8_urls.add(
compat_urlparse.urljoin(first_segment_url.group(1), '0-end/master.m3u8'))
for m3u8_url in m3u8_urls:
ad_free_formats.extend(
self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(ad_free_formats)
return ad_free_formats
fmts, subs = self._extract_m3u8_formats_and_subtitles(
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
ad_free_formats.extend(fmts)
ad_free_subtitles = self._merge_subtitles(ad_free_subtitles, subs)
if ad_free_formats:
formats, subtitles = ad_free_formats, ad_free_subtitles
else:
self._downloader.report_warning('Unable to find ad-free formats')
return formats, subtitles
def _get_video_info(self, video_json, slug, series_name=None):
video_id = video_json.get('_id', slug)
formats = []
formats, subtitles = [], {}
for video_url in try_get(video_json, lambda x: x['stitched']['urls'], list) or []:
if video_url.get('type') != 'hls':
continue
url = url_or_none(video_url.get('url'))
formats.extend(
self._extract_m3u8_formats(
url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
fmts, subs = self._extract_m3u8_formats_and_subtitles(
url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
formats, subtitles = self._to_ad_free_formats(video_id, formats, subtitles)
self._sort_formats(formats)
info = {
'id': video_id,
'formats': self._to_ad_free_formats(video_id, formats),
'formats': formats,
'subtitles': subtitles,
'title': video_json.get('name'),
'description': video_json.get('description'),
'duration': float_or_none(video_json.get('duration'), scale=1000),

View File

@@ -13,9 +13,24 @@ from ..utils import smuggle_url
class RMCDecouverteIE(InfoExtractor):
_VALID_URL = r'https?://rmcdecouverte\.bfmtv\.com/(?:(?:[^/]+/)*program_(?P<id>\d+)|(?P<live_id>mediaplayer-direct))'
_VALID_URL = r'https?://rmcdecouverte\.bfmtv\.com/(?:[^/]+/(?P<id>[^?#/]+)|(?P<live_id>mediaplayer-direct))'
_TESTS = [{
'url': 'https://rmcdecouverte.bfmtv.com/vestiges-de-guerre_22240/les-bunkers-secrets-domaha-beach_25303/',
'info_dict': {
'id': '6250879771001',
'ext': 'mp4',
'title': 'LES BUNKERS SECRETS D´OMAHA BEACH',
'uploader_id': '1969646226001',
'description': 'md5:aed573ca24abde62a148e0eba909657d',
'timestamp': 1619622984,
'upload_date': '20210428',
},
'params': {
'format': 'bestvideo',
'skip_download': True,
},
}, {
'url': 'https://rmcdecouverte.bfmtv.com/wheeler-dealers-occasions-a-saisir/program_2566/',
'info_dict': {
'id': '5983675500001',

View File

@@ -103,7 +103,7 @@ class RoosterTeethIE(InfoExtractor):
api_episode_url + '/videos', display_id,
'Downloading video JSON metadata')['data'][0]
m3u8_url = video_data['attributes']['url']
subtitle_m3u8_url = video_data['links']['download']
# XXX: additional URL at video_data['links']['download']
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
if self._parse_json(e.cause.read().decode(), display_id).get('access') is False:
@@ -111,7 +111,7 @@ class RoosterTeethIE(InfoExtractor):
'%s is only available for FIRST members' % display_id)
raise
formats = self._extract_m3u8_formats(
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
m3u8_url, display_id, 'mp4', 'm3u8_native', m3u8_id='hls')
self._sort_formats(formats)
@@ -134,33 +134,6 @@ class RoosterTeethIE(InfoExtractor):
'url': img_url,
})
subtitles = {}
res = self._download_webpage_handle(
subtitle_m3u8_url, display_id,
'Downloading m3u8 information',
'Failed to download m3u8 information',
fatal=True, data=None, headers={}, query={})
if res is not False:
subtitle_m3u8_doc, _ = res
for line in subtitle_m3u8_doc.split('\n'):
if 'EXT-X-MEDIA:TYPE=SUBTITLES' in line:
parts = line.split(',')
for part in parts:
if 'LANGUAGE' in part:
lang = part[part.index('=') + 2:-1]
elif 'URI' in part:
uri = part[part.index('=') + 2:-1]
res = self._download_webpage_handle(
uri, display_id,
'Downloading m3u8 information',
'Failed to download m3u8 information',
fatal=True, data=None, headers={}, query={})
doc, _ = res
for l in doc.split('\n'):
if not l.startswith('#'):
subtitles[lang] = [{'url': uri[:-uri[::-1].index('/')] + l}]
break
return {
'id': video_id,
'display_id': display_id,

View File

@@ -17,7 +17,7 @@ class SonyLIVIE(InfoExtractor):
_TESTS = [{
'url': 'https://www.sonyliv.com/shows/bachelors-delight-1700000113/achaari-cheese-toast-1000022678?watch=true',
'info_dict': {
'title': 'Bachelors Delight - Achaari Cheese Toast',
'title': 'Achaari Cheese Toast',
'id': '1000022678',
'ext': 'mp4',
'upload_date': '20200411',
@@ -25,7 +25,7 @@ class SonyLIVIE(InfoExtractor):
'timestamp': 1586632091,
'duration': 185,
'season_number': 1,
'episode': 'Achaari Cheese Toast',
'series': 'Bachelors Delight',
'episode_number': 1,
'release_year': 2016,
},
@@ -92,10 +92,7 @@ class SonyLIVIE(InfoExtractor):
metadata = self._call_api(
'1.6', 'IN/DETAIL/' + video_id, video_id)['containers'][0]['metadata']
title = metadata['title']
episode = metadata.get('episodeTitle')
if episode and title != episode:
title += ' - ' + episode
title = metadata['episodeTitle']
return {
'id': video_id,
@@ -106,7 +103,7 @@ class SonyLIVIE(InfoExtractor):
'timestamp': int_or_none(metadata.get('creationDate'), 1000),
'duration': int_or_none(metadata.get('duration')),
'season_number': int_or_none(metadata.get('season')),
'episode': episode,
'series': metadata.get('title'),
'episode_number': int_or_none(metadata.get('episodeNumber')),
'release_year': int_or_none(metadata.get('year')),
}

View File

@@ -87,6 +87,7 @@ class SRGSSRIE(InfoExtractor):
title = media_data['title']
formats = []
subtitles = {}
q = qualities(['SD', 'HD'])
for source in (media_data.get('resourceList') or []):
format_url = source.get('url')
@@ -104,12 +105,16 @@ class SRGSSRIE(InfoExtractor):
if source.get('tokenType') == 'AKAMAI':
format_url = self._get_tokenized_src(
format_url, media_id, format_id)
formats.extend(self._extract_akamai_formats(
format_url, media_id))
fmts, subs = self._extract_akamai_formats_and_subtitles(
format_url, media_id)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
elif protocol == 'HLS':
formats.extend(self._extract_m3u8_formats(
m3u8_fmts, m3u8_subs = self._extract_m3u8_formats_and_subtitles(
format_url, media_id, 'mp4', 'm3u8_native',
m3u8_id=format_id, fatal=False))
m3u8_id=format_id, fatal=False)
formats.extend(m3u8_fmts)
subtitles = self._merge_subtitles(subtitles, m3u8_subs)
elif protocol in ('HTTP', 'HTTPS'):
formats.append({
'format_id': format_id,
@@ -133,7 +138,6 @@ class SRGSSRIE(InfoExtractor):
})
self._sort_formats(formats)
subtitles = {}
if media_type == 'video':
for sub in (media_data.get('subtitleList') or []):
sub_url = sub.get('url')

View File

@@ -146,7 +146,7 @@ class SVTPlayIE(SVTPlayBaseIE):
)
(?P<svt_id>[^/?#&]+)|
https?://(?:www\.)?(?:svtplay|oppetarkiv)\.se/(?:video|klipp|kanaler)/(?P<id>[^/?#&]+)
(?:.*?modalId=(?P<modal_id>[\da-zA-Z-]+))?
(?:.*?(?:modalId|id)=(?P<modal_id>[\da-zA-Z-]+))?
)
'''
_TESTS = [{
@@ -177,6 +177,9 @@ class SVTPlayIE(SVTPlayBaseIE):
}, {
'url': 'https://www.svtplay.se/video/30479064/husdrommar/husdrommar-sasong-8-designdrommar-i-stenungsund?modalId=8zVbDPA',
'only_matching': True,
}, {
'url': 'https://www.svtplay.se/video/30684086/rapport/rapport-24-apr-18-00-7?id=e72gVpa',
'only_matching': True,
}, {
# geo restricted to Sweden
'url': 'http://www.oppetarkiv.se/video/5219710/trollflojten',
@@ -259,7 +262,7 @@ class SVTPlayIE(SVTPlayBaseIE):
if not svt_id:
svt_id = self._search_regex(
(r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
r'<[^>]+\bdata-rt=["\']top-area-play-button["\'][^>]+\bhref=["\'][^"\']*video/%s/[^"\']*\bmodalId=([\da-zA-Z-]+)' % re.escape(video_id),
r'<[^>]+\bdata-rt=["\']top-area-play-button["\'][^>]+\bhref=["\'][^"\']*video/%s/[^"\']*\b(?:modalId|id)=([\da-zA-Z-]+)' % re.escape(video_id),
r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
r'["\']videoSvtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)',
r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"',

View File

@@ -99,16 +99,21 @@ class ThreeQSDNIE(InfoExtractor):
aspect = float_or_none(config.get('aspect'))
formats = []
subtitles = {}
for source_type, source in (config.get('sources') or {}).items():
if not source:
continue
if source_type == 'dash':
formats.extend(self._extract_mpd_formats(
source, video_id, mpd_id='mpd', fatal=False))
fmts, subs = self._extract_mpd_formats_and_subtitles(
source, video_id, mpd_id='mpd', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
elif source_type == 'hls':
formats.extend(self._extract_m3u8_formats(
fmts, subs = self._extract_m3u8_formats_and_subtitles(
source, video_id, 'mp4', 'm3u8' if live else 'm3u8_native',
m3u8_id='hls', fatal=False))
m3u8_id='hls', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
elif source_type == 'progressive':
for s in source:
src = s.get('src')
@@ -138,7 +143,6 @@ class ThreeQSDNIE(InfoExtractor):
# behaviour is being kept as-is
self._sort_formats(formats, ('res', 'source_preference'))
subtitles = {}
for subtitle in (config.get('subtitles') or []):
src = subtitle.get('src')
if not src:

View File

@@ -81,9 +81,13 @@ class TubiTvIE(InfoExtractor):
'http://tubitv.com/oz/videos/%s/content' % video_id, video_id)
title = video_data['title']
formats = self._extract_m3u8_formats(
self._proto_relative_url(video_data['url']),
video_id, 'mp4', 'm3u8_native')
formats = []
url = video_data['url']
# URL can be sometimes empty. Does this only happen when there is DRM?
if url:
formats = self._extract_m3u8_formats(
self._proto_relative_url(url),
video_id, 'mp4', 'm3u8_native')
self._sort_formats(formats)
thumbnails = []

View File

@@ -74,6 +74,12 @@ class TV2DKIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
entries = []
def add_entry(partner_id, kaltura_id):
entries.append(self.url_result(
'kaltura:%s:%s' % (partner_id, kaltura_id), 'Kaltura',
video_id=kaltura_id))
for video_el in re.findall(r'(?s)<[^>]+\bdata-entryid\s*=[^>]*>', webpage):
video = extract_attributes(video_el)
kaltura_id = video.get('data-entryid')
@@ -82,9 +88,14 @@ class TV2DKIE(InfoExtractor):
partner_id = video.get('data-partnerid')
if not partner_id:
continue
entries.append(self.url_result(
'kaltura:%s:%s' % (partner_id, kaltura_id), 'Kaltura',
video_id=kaltura_id))
add_entry(partner_id, kaltura_id)
if not entries:
kaltura_id = self._search_regex(
r'entry_id\s*:\s*["\']([0-9a-z_]+)', webpage, 'kaltura id')
partner_id = self._search_regex(
(r'\\u002Fp\\u002F(\d+)\\u002F', r'/p/(\d+)/'), webpage,
'partner id')
add_entry(partner_id, kaltura_id)
return self.playlist_result(entries)

View File

@@ -93,18 +93,31 @@ class TV4IE(InfoExtractor):
'device': 'browser',
'protocol': 'hls',
})['playbackItem']['manifestUrl']
formats = self._extract_m3u8_formats(
formats = []
subtitles = {}
fmts, subs = self._extract_m3u8_formats_and_subtitles(
manifest_url, video_id, 'mp4',
'm3u8_native', m3u8_id='hls', fatal=False)
formats.extend(self._extract_mpd_formats(
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
fmts, subs = self._extract_mpd_formats_and_subtitles(
manifest_url.replace('.m3u8', '.mpd'),
video_id, mpd_id='dash', fatal=False))
formats.extend(self._extract_f4m_formats(
video_id, mpd_id='dash', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
fmts = self._extract_f4m_formats(
manifest_url.replace('.m3u8', '.f4m'),
video_id, f4m_id='hds', fatal=False))
formats.extend(self._extract_ism_formats(
video_id, f4m_id='hds', fatal=False)
formats.extend(fmts)
fmts, subs = self._extract_ism_formats_and_subtitles(
re.sub(r'\.ism/.*?\.m3u8', r'.ism/Manifest', manifest_url),
video_id, ism_id='mss', fatal=False))
video_id, ism_id='mss', fatal=False)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
if not formats and info.get('is_geo_restricted'):
self.raise_geo_restricted(countries=self._GEO_COUNTRIES, metadata_available=True)
@@ -115,7 +128,7 @@ class TV4IE(InfoExtractor):
'id': video_id,
'title': title,
'formats': formats,
# 'subtitles': subtitles,
'subtitles': subtitles,
'description': info.get('description'),
'timestamp': parse_iso8601(info.get('broadcast_date_time')),
'duration': int_or_none(info.get('duration')),

View File

@@ -9,7 +9,6 @@ from ..utils import (
int_or_none,
remove_start,
smuggle_url,
strip_or_none,
try_get,
)
@@ -45,32 +44,18 @@ class TVerIE(InfoExtractor):
query={'token': self._TOKEN})['main']
p_id = main['publisher_id']
service = remove_start(main['service'], 'ts_')
info = {
r_id = main['reference_id']
if service not in ('tx', 'russia2018', 'sebare2018live', 'gorin'):
r_id = 'ref:' + r_id
bc_url = smuggle_url(
self.BRIGHTCOVE_URL_TEMPLATE % (p_id, r_id),
{'geo_countries': ['JP']})
return {
'_type': 'url_transparent',
'description': try_get(main, lambda x: x['note'][0]['text'], compat_str),
'episode_number': int_or_none(try_get(main, lambda x: x['ext']['episode_number'])),
'url': bc_url,
'ie_key': 'BrightcoveNew',
}
if service == 'cx':
title = main['title']
subtitle = strip_or_none(main.get('subtitle'))
if subtitle:
title += ' - ' + subtitle
info.update({
'title': title,
'url': 'https://i.fod.fujitv.co.jp/plus7/web/%s/%s.html' % (p_id[:4], p_id),
'ie_key': 'FujiTVFODPlus7',
})
else:
r_id = main['reference_id']
if service not in ('tx', 'russia2018', 'sebare2018live', 'gorin'):
r_id = 'ref:' + r_id
bc_url = smuggle_url(
self.BRIGHTCOVE_URL_TEMPLATE % (p_id, r_id),
{'geo_countries': ['JP']})
info.update({
'url': bc_url,
'ie_key': 'BrightcoveNew',
})
return info

View File

@@ -19,6 +19,7 @@ from ..utils import (
strip_or_none,
unified_timestamp,
update_url_query,
url_or_none,
xpath_text,
)
@@ -36,9 +37,9 @@ class TwitterBaseIE(InfoExtractor):
def _extract_variant_formats(self, variant, video_id):
variant_url = variant.get('url')
if not variant_url:
return []
return [], {}
elif '.m3u8' in variant_url:
return self._extract_m3u8_formats(
return self._extract_m3u8_formats_and_subtitles(
variant_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False)
else:
@@ -49,22 +50,30 @@ class TwitterBaseIE(InfoExtractor):
'tbr': tbr,
}
self._search_dimensions_in_video_url(f, variant_url)
return [f]
return [f], {}
def _extract_formats_from_vmap_url(self, vmap_url, video_id):
vmap_url = url_or_none(vmap_url)
if not vmap_url:
return []
vmap_data = self._download_xml(vmap_url, video_id)
formats = []
subtitles = {}
urls = []
for video_variant in vmap_data.findall('.//{http://twitter.com/schema/videoVMapV2.xsd}videoVariant'):
video_variant.attrib['url'] = compat_urllib_parse_unquote(
video_variant.attrib['url'])
urls.append(video_variant.attrib['url'])
formats.extend(self._extract_variant_formats(
video_variant.attrib, video_id))
fmts, subs = self._extract_variant_formats(
video_variant.attrib, video_id)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
video_url = strip_or_none(xpath_text(vmap_data, './/MediaFile'))
if video_url not in urls:
formats.extend(self._extract_variant_formats({'url': video_url}, video_id))
return formats
fmts, subs = self._extract_variant_formats({'url': video_url}, video_id)
formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs)
return formats, subtitles
@staticmethod
def _search_dimensions_in_video_url(a_format, video_url):
@@ -471,8 +480,11 @@ class TwitterIE(TwitterBaseIE):
video_info = media.get('video_info') or {}
formats = []
subtitles = {}
for variant in video_info.get('variants', []):
formats.extend(self._extract_variant_formats(variant, twid))
fmts, subs = self._extract_variant_formats(variant, twid)
subtitles = self._merge_subtitles(subtitles, subs)
formats.extend(fmts)
self._sort_formats(formats)
thumbnails = []
@@ -491,6 +503,7 @@ class TwitterIE(TwitterBaseIE):
info.update({
'formats': formats,
'subtitles': subtitles,
'thumbnails': thumbnails,
'duration': float_or_none(video_info.get('duration_millis'), 1000),
})
@@ -540,7 +553,7 @@ class TwitterIE(TwitterBaseIE):
is_amplify = card_name == 'amplify'
vmap_url = get_binding_value('amplify_url_vmap') if is_amplify else get_binding_value('player_stream_url')
content_id = get_binding_value('%s_content_id' % (card_name if is_amplify else 'player'))
formats = self._extract_formats_from_vmap_url(vmap_url, content_id or twid)
formats, subtitles = self._extract_formats_from_vmap_url(vmap_url, content_id or twid)
self._sort_formats(formats)
thumbnails = []
@@ -558,6 +571,7 @@ class TwitterIE(TwitterBaseIE):
info.update({
'formats': formats,
'subtitles': subtitles,
'thumbnails': thumbnails,
'duration': int_or_none(get_binding_value(
'content_duration_seconds')),

View File

@@ -0,0 +1,72 @@
from __future__ import unicode_literals
from ..utils import (
unescapeHTML,
urljoin,
ExtractorError,
)
from .common import InfoExtractor
from .vimeo import VimeoIE
from .youtube import YoutubeIE
class UkColumnIE(InfoExtractor):
IE_NAME = 'ukcolumn'
_VALID_URL = r'(?i)https?://(?:www\.)?ukcolumn\.org(/index\.php)?/(?:video|ukcolumn-news)/(?P<id>[-a-z0-9]+)'
_TESTS = [{
'url': 'https://www.ukcolumn.org/ukcolumn-news/uk-column-news-28th-april-2021',
'info_dict': {
'id': '541632443',
'ext': 'mp4',
'title': 'UK Column News - 28th April 2021',
'uploader_id': 'ukcolumn',
'uploader': 'UK Column',
},
'add_ie': [VimeoIE.ie_key()],
'expected_warnings': ['Unable to download JSON metadata'],
'params': {
'skip_download': 'Handled by Vimeo',
},
}, {
'url': 'https://www.ukcolumn.org/video/insight-eu-military-unification',
'info_dict': {
'id': 'Fzbnb9t7XAw',
'ext': 'mp4',
'title': 'Insight: EU Military Unification',
'uploader_id': 'ukcolumn',
'description': 'md5:29a207965271af89baa0bc191f5de576',
'uploader': 'UK Column',
'upload_date': '20170514',
},
'add_ie': [YoutubeIE.ie_key()],
'params': {
'skip_download': 'Handled by Youtube',
},
}, {
'url': 'https://www.ukcolumn.org/index.php/ukcolumn-news/uk-column-news-30th-april-2021',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
oembed_url = urljoin(url, unescapeHTML(self._search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>/media/oembed\?url=.+?)\1',
webpage, 'OEmbed URL', group='url')))
oembed_webpage = self._download_webpage(
oembed_url, display_id, note='Downloading OEmbed page')
ie, video_url = YoutubeIE, YoutubeIE._extract_url(oembed_webpage)
if not video_url:
ie, video_url = VimeoIE, VimeoIE._extract_url(url, oembed_webpage)
if not video_url:
raise ExtractorError('No embedded video found')
return {
'_type': 'url_transparent',
'title': self._og_search_title(webpage),
'url': video_url,
'ie_key': ie.ie_key(),
}

View File

@@ -30,7 +30,7 @@ class UplynkIE(InfoExtractor):
def _extract_uplynk_info(self, uplynk_content_url):
path, external_id, video_id, session_id = re.match(UplynkIE._VALID_URL, uplynk_content_url).groups()
display_id = video_id or external_id
formats = self._extract_m3u8_formats(
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
'http://content.uplynk.com/%s.m3u8' % path,
display_id, 'mp4', 'm3u8_native')
if session_id:
@@ -48,6 +48,7 @@ class UplynkIE(InfoExtractor):
'duration': float_or_none(asset.get('duration')),
'uploader_id': asset.get('owner'),
'formats': formats,
'subtitles': subtitles,
}
def _real_extract(self, url):

View File

@@ -69,19 +69,24 @@ class WatIE(InfoExtractor):
title = video_info['title']
formats = []
subtitles = {}
def extract_formats(manifest_urls):
for f, f_url in manifest_urls.items():
if not f_url:
continue
if f in ('dash', 'mpd'):
formats.extend(self._extract_mpd_formats(
fmts, subs = self._extract_mpd_formats_and_subtitles(
f_url.replace('://das-q1.tf1.fr/', '://das-q1-ssl.tf1.fr/'),
video_id, mpd_id='dash', fatal=False))
video_id, mpd_id='dash', fatal=False)
elif f == 'hls':
formats.extend(self._extract_m3u8_formats(
fmts, subs = self._extract_m3u8_formats_and_subtitles(
f_url, video_id, 'mp4',
'm3u8_native', m3u8_id='hls', fatal=False))
'm3u8_native', m3u8_id='hls', fatal=False)
else:
continue
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
delivery = video_data.get('delivery') or {}
extract_formats({delivery.get('format'): delivery.get('url')})
@@ -103,4 +108,5 @@ class WatIE(InfoExtractor):
video_data, lambda x: x['mediametrie']['chapters'][0]['estatS4'])),
'duration': int_or_none(video_info.get('duration')),
'formats': formats,
'subtitles': subtitles,
}

View File

@@ -0,0 +1,101 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
qualities,
try_get,
ExtractorError,
)
from ..compat import compat_str
class WhoWatchIE(InfoExtractor):
IE_NAME = 'whowatch'
_VALID_URL = r'https?://whowatch\.tv/viewer/(?P<id>\d+)'
_TESTS = [{
'url': 'https://whowatch.tv/viewer/21450171',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
self._download_webpage(url, video_id)
metadata = self._download_json('https://api.whowatch.tv/lives/%s' % video_id, video_id)
live_data = self._download_json('https://api.whowatch.tv/lives/%s/play' % video_id, video_id)
title = try_get(None, (
lambda x: live_data['share_info']['live_title'][1:-1],
lambda x: metadata['live']['title'],
), compat_str)
hls_url = live_data.get('hls_url')
if not hls_url:
raise ExtractorError(live_data.get('error_message') or 'The user is offline.', expected=True)
QUALITIES = qualities(['low', 'medium', 'high', 'veryhigh'])
formats = []
for i, fmt in enumerate(live_data.get('streams') or []):
name = fmt.get('quality') or fmt.get('name') or compat_str(i)
hls_url = fmt.get('hls_url')
rtmp_url = fmt.get('rtmp_url')
audio_only = fmt.get('audio_only')
quality = QUALITIES(fmt.get('quality'))
if hls_url:
hls_fmts = self._extract_m3u8_formats(
hls_url, video_id, ext='mp4', entry_protocol='m3u8',
m3u8_id='hls-%s' % name, quality=quality)
formats.extend(hls_fmts)
else:
hls_fmts = []
# RTMP url for audio_only is same as high format, so skip it
if rtmp_url and not audio_only:
formats.append({
'url': rtmp_url,
'format_id': 'rtmp-%s' % name,
'ext': 'mp4',
'protocol': 'rtmp_ffmpeg', # ffmpeg can, while rtmpdump can't
'vcodec': 'h264',
'acodec': 'aac',
'quality': quality,
'format_note': fmt.get('label'),
# note: HLS and RTMP have same resolution for now, so it's acceptable
'width': try_get(hls_fmts, lambda x: x[0]['width'], int),
'height': try_get(hls_fmts, lambda x: x[0]['height'], int),
})
# This contains the same formats as the above manifests and is used only as a fallback
formats.extend(self._extract_m3u8_formats(
hls_url, video_id, ext='mp4', entry_protocol='m3u8',
m3u8_id='hls'))
self._remove_duplicate_formats(formats)
self._sort_formats(formats)
uploader_url = try_get(metadata, lambda x: x['live']['user']['user_path'], compat_str)
if uploader_url:
uploader_url = 'https://whowatch.tv/profile/%s' % uploader_url
uploader_id = compat_str(try_get(metadata, lambda x: x['live']['user']['id'], int))
uploader = try_get(metadata, lambda x: x['live']['user']['name'], compat_str)
thumbnail = try_get(metadata, lambda x: x['live']['latest_thumbnail_url'], compat_str)
timestamp = int_or_none(try_get(metadata, lambda x: x['live']['started_at'], int), scale=1000)
view_count = try_get(metadata, lambda x: x['live']['total_view_count'], int)
comment_count = try_get(metadata, lambda x: x['live']['comment_count'], int)
return {
'id': video_id,
'title': title,
'uploader_id': uploader_id,
'uploader_url': uploader_url,
'uploader': uploader,
'formats': formats,
'thumbnail': thumbnail,
'timestamp': timestamp,
'view_count': view_count,
'comment_count': comment_count,
'is_live': True,
}

View File

@@ -58,6 +58,7 @@ class XFileShareIE(InfoExtractor):
(r'vidlocker\.xyz', 'VidLocker'),
(r'vidshare\.tv', 'VidShare'),
(r'vup\.to', 'VUp'),
(r'wolfstream\.tv', 'WolfStream'),
(r'xvideosharing\.com', 'XVideoSharing'),
)
@@ -82,6 +83,9 @@ class XFileShareIE(InfoExtractor):
}, {
'url': 'https://aparat.cam/n4d6dh0wvlpr',
'only_matching': True,
}, {
'url': 'https://wolfstream.tv/nthme29v9u2x',
'only_matching': True,
}]
@staticmethod

View File

@@ -11,6 +11,7 @@ from ..utils import (
parse_duration,
sanitized_Request,
str_to_int,
url_or_none,
)
@@ -71,10 +72,10 @@ class XTubeIE(InfoExtractor):
'Cookie': 'age_verified=1; cookiesAccepted=1',
})
title, thumbnail, duration = [None] * 3
title, thumbnail, duration, sources, media_definition = [None] * 5
config = self._parse_json(self._search_regex(
r'playerConf\s*=\s*({.+?})\s*,\s*(?:\n|loaderConf)', webpage, 'config',
r'playerConf\s*=\s*({.+?})\s*,\s*(?:\n|loaderConf|playerWrapper)', webpage, 'config',
default='{}'), video_id, transform_source=js_to_json, fatal=False)
if config:
config = config.get('mainRoll')
@@ -83,20 +84,52 @@ class XTubeIE(InfoExtractor):
thumbnail = config.get('poster')
duration = int_or_none(config.get('duration'))
sources = config.get('sources') or config.get('format')
media_definition = config.get('mediaDefinition')
if not isinstance(sources, dict):
if not isinstance(sources, dict) and not media_definition:
sources = self._parse_json(self._search_regex(
r'(["\'])?sources\1?\s*:\s*(?P<sources>{.+?}),',
webpage, 'sources', group='sources'), video_id,
transform_source=js_to_json)
formats = []
for format_id, format_url in sources.items():
formats.append({
'url': format_url,
'format_id': format_id,
'height': int_or_none(format_id),
})
format_urls = set()
if isinstance(sources, dict):
for format_id, format_url in sources.items():
format_url = url_or_none(format_url)
if not format_url:
continue
if format_url in format_urls:
continue
format_urls.add(format_url)
formats.append({
'url': format_url,
'format_id': format_id,
'height': int_or_none(format_id),
})
if isinstance(media_definition, list):
for media in media_definition:
video_url = url_or_none(media.get('videoUrl'))
if not video_url:
continue
if video_url in format_urls:
continue
format_urls.add(video_url)
format_id = media.get('format')
if format_id == 'hls':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif format_id == 'mp4':
height = int_or_none(media.get('quality'))
formats.append({
'url': video_url,
'format_id': '%s-%d' % (format_id, height) if height else format_id,
'height': height,
})
self._remove_duplicate_formats(formats)
self._sort_formats(formats)

View File

@@ -68,7 +68,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
_RESERVED_NAMES = (
r'channel|c|user|playlist|watch|w|v|embed|e|watch_popup|'
r'movies|results|shared|hashtag|trending|feed|feeds|'
r'movies|results|shared|hashtag|trending|feed|feeds|oembed|'
r'storefront|oops|index|account|reporthistory|t/terms|about|upload|signin|logout')
_NETRC_MACHINE = 'youtube'
@@ -2958,12 +2958,19 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
return
# video attachment
video_renderer = try_get(
post_renderer, lambda x: x['backstageAttachment']['videoRenderer'], dict)
video_id = None
if video_renderer:
entry = self._video_entry(video_renderer)
post_renderer, lambda x: x['backstageAttachment']['videoRenderer'], dict) or {}
video_id = video_renderer.get('videoId')
if video_id:
entry = self._extract_video(video_renderer)
if entry:
yield entry
# playlist attachment
playlist_id = try_get(
post_renderer, lambda x: x['backstageAttachment']['playlistRenderer']['playlistId'], compat_str)
if playlist_id:
yield self.url_result(
'https://www.youtube.com/playlist?list=%s' % playlist_id,
ie=YoutubeTabIE.ie_key(), video_id=playlist_id)
# inline video links
runs = try_get(post_renderer, lambda x: x['contentText']['runs'], list) or []
for run in runs:
@@ -2978,7 +2985,7 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
ep_video_id = YoutubeIE._match_id(ep_url)
if video_id == ep_video_id:
continue
yield self.url_result(ep_url, ie=YoutubeIE.ie_key(), video_id=video_id)
yield self.url_result(ep_url, ie=YoutubeIE.ie_key(), video_id=ep_video_id)
def _post_thread_continuation_entries(self, post_thread_continuation):
contents = post_thread_continuation.get('contents')
@@ -3474,11 +3481,12 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
item_id = self._match_id(url)
url = compat_urlparse.urlunparse(
compat_urlparse.urlparse(url)._replace(netloc='www.youtube.com'))
compat_opts = self._downloader.params.get('compat_opts', [])
# This is not matched in a channel page with a tab selected
mobj = re.match(r'(?P<pre>%s)(?P<post>/?(?![^#?]).*$)' % self._VALID_URL, url)
mobj = mobj.groupdict() if mobj else {}
if mobj and not mobj.get('not_channel'):
if mobj and not mobj.get('not_channel') and 'no-youtube-channel-redirect' not in compat_opts:
self.report_warning(
'A channel/user page was given. All the channel\'s videos will be downloaded. '
'To download only the videos in the home page, add a "/featured" to the URL')
@@ -3506,7 +3514,8 @@ class YoutubeTabIE(YoutubeBaseInfoExtractor):
webpage, data = self._extract_webpage(url, item_id)
# YouTube sometimes provides a button to reload playlist with unavailable videos.
data = self._reload_with_unavailable_videos(item_id, data, webpage) or data
if 'no-youtube-unavailable-videos' not in compat_opts:
data = self._reload_with_unavailable_videos(item_id, data, webpage) or data
tabs = try_get(
data, lambda x: x['contents']['twoColumnBrowseResultsRenderer']['tabs'], list)

View File

@@ -37,7 +37,7 @@ class Zee5IE(InfoExtractor):
'title': 'Krishna - The Birth',
'duration': 4368,
'average_rating': 4,
'description': str,
'description': compat_str,
'alt_title': 'Krishna - The Birth',
'uploader': 'Zee Entertainment Enterprises Ltd',
'release_date': '20060101',
@@ -58,7 +58,7 @@ class Zee5IE(InfoExtractor):
'title': 'Episode 1 - The Test Of Bramha',
'duration': 1336,
'average_rating': 4,
'description': str,
'description': compat_str,
'alt_title': 'Episode 1 - The Test Of Bramha',
'uploader': 'Green Gold',
'release_date': '20090101',
@@ -95,16 +95,16 @@ class Zee5IE(InfoExtractor):
m3u8_url = try_get(
json_data,
(lambda x: x['hls'][0], lambda x: x['video_details']['hls_url']),
str)
compat_str)
formats = self._extract_m3u8_formats(
'https://zee5vodnd.akamaized.net' + m3u8_url.replace('/drm1/', '/hls1/') + token_request['video_token'],
'https://zee5vodnd.akamaized.net' + m3u8_url.replace('/drm', '/hls', 1) + token_request['video_token'],
video_id, fatal=False)
mpd_url = try_get(
json_data,
(lambda x: x['video'][0], lambda x: x['video_details']['url']),
str)
compat_str)
formats += self._extract_mpd_formats(
'https://zee5vodnd.akamaized.net' + mpd_url + token_request['video_token'],
'https://zee5vod.akamaized.net' + mpd_url,
video_id, fatal=False)
self._sort_formats(formats)

View File

@@ -165,7 +165,7 @@ def parseOpts(overrideArguments=None):
help='Update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)')
general.add_option(
'-i', '--ignore-errors', '--no-abort-on-error',
action='store_true', dest='ignoreerrors', default=True,
action='store_true', dest='ignoreerrors', default=None,
help='Continue on download errors, for example to skip unavailable videos in a playlist (default) (Alias: --no-abort-on-error)')
general.add_option(
'--abort-on-error', '--no-ignore-errors',
@@ -229,6 +229,14 @@ def parseOpts(overrideArguments=None):
'--no-colors',
action='store_true', dest='no_color', default=False,
help='Do not emit color codes in output')
general.add_option(
'--compat-options',
metavar='OPTS', dest='compat_opts', default=[],
action='callback', callback=_comma_separated_values_options_callback, type='str',
help=(
'Options that can help keep compatibility with youtube-dl and youtube-dlc '
'configurations by reverting some of the changes made in yt-dlp. '
'See "Differences in default behavior" for details'))
network = optparse.OptionGroup(parser, 'Network Options')
network.add_option(
@@ -474,7 +482,7 @@ def parseOpts(overrideArguments=None):
'see "Sorting Formats" for more details'))
video_format.add_option(
'--video-multistreams',
action='store_true', dest='allow_multiple_video_streams', default=False,
action='store_true', dest='allow_multiple_video_streams', default=None,
help='Allow multiple video streams to be merged into a single file')
video_format.add_option(
'--no-video-multistreams',
@@ -482,7 +490,7 @@ def parseOpts(overrideArguments=None):
help='Only one video stream is downloaded for each output file (default)')
video_format.add_option(
'--audio-multistreams',
action='store_true', dest='allow_multiple_audio_streams', default=False,
action='store_true', dest='allow_multiple_audio_streams', default=None,
help='Allow multiple audio streams to be merged into a single file')
video_format.add_option(
'--no-audio-multistreams',
@@ -502,6 +510,10 @@ def parseOpts(overrideArguments=None):
'--no-prefer-free-formats',
action='store_true', dest='prefer_free_formats', default=False,
help="Don't give any special preference to free containers (default)")
video_format.add_option(
'--check-formats',
action='store_true', dest='check_formats', default=False,
help="Check that the formats selected are actually downloadable (Experimental)")
video_format.add_option(
'-F', '--list-formats',
action='store_true', dest='listformats',
@@ -509,11 +521,11 @@ def parseOpts(overrideArguments=None):
video_format.add_option(
'--list-formats-as-table',
action='store_true', dest='listformats_table', default=True,
help='Present the output of -F in tabular form (default)')
help=optparse.SUPPRESS_HELP)
video_format.add_option(
'--list-formats-old', '--no-list-formats-as-table',
action='store_false', dest='listformats_table',
help='Present the output of -F in the old form (Alias: --no-list-formats-as-table)')
help=optparse.SUPPRESS_HELP)
video_format.add_option(
'--merge-output-format',
action='store', dest='merge_output_format', metavar='FORMAT', default=None,
@@ -669,7 +681,9 @@ def parseOpts(overrideArguments=None):
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': 'http|ftp|m3u8|dash|rtsp|rtmp|mms',
'default_key': 'default', 'process': lambda x: x.strip()},
'default_key': 'default',
'process': lambda x: x.strip()
},
help=(
'Name or path of the external downloader to use (optionally) prefixed by '
'the protocols (http, ftp, m3u8, dash, rstp, rtmp, mms) to use it for. '
@@ -684,7 +698,9 @@ def parseOpts(overrideArguments=None):
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': '|'.join(list_external_downloaders()),
'default_key': 'default', 'process': compat_shlex_split},
'default_key': 'default',
'process': compat_shlex_split
},
help=(
'Give these arguments to the external downloader. '
'Specify the downloader name and the arguments separated by a colon ":". '
@@ -878,9 +894,7 @@ def parseOpts(overrideArguments=None):
'-P', '--paths',
metavar='TYPES:PATH', dest='paths', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': 'home|temp|%s' % '|'.join(OUTTMPL_TYPES.keys()),
'process': lambda x: x.strip()},
callback_kwargs={'allowed_keys': 'home|temp|%s' % '|'.join(OUTTMPL_TYPES.keys())},
help=(
'The paths where the files should be downloaded. '
'Specify the type of file and the path separated by a colon ":". '
@@ -895,7 +909,8 @@ def parseOpts(overrideArguments=None):
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': '|'.join(OUTTMPL_TYPES.keys()),
'default_key': 'default', 'process': lambda x: x.strip()},
'default_key': 'default'
},
help='Output filename template; see "OUTPUT TEMPLATE" for details')
filesystem.add_option(
'--output-na-placeholder',
@@ -930,15 +945,15 @@ def parseOpts(overrideArguments=None):
dest='trim_file_name', default=0, type=int,
help='Limit the filename length (excluding extension) to the specified number of characters')
filesystem.add_option(
'-A', '--auto-number',
'--auto-number',
action='store_true', dest='autonumber', default=False,
help=optparse.SUPPRESS_HELP)
filesystem.add_option(
'-t', '--title',
'--title',
action='store_true', dest='usetitle', default=False,
help=optparse.SUPPRESS_HELP)
filesystem.add_option(
'-l', '--literal', default=False,
'--literal', default=False,
action='store_true', dest='usetitle',
help=optparse.SUPPRESS_HELP)
filesystem.add_option(
@@ -1005,7 +1020,7 @@ def parseOpts(overrideArguments=None):
help='Do not write video annotations (default)')
filesystem.add_option(
'--write-playlist-metafiles',
action='store_true', dest='allow_playlist_files', default=True,
action='store_true', dest='allow_playlist_files', default=None,
help=(
'Write playlist metadata in addition to the video metadata '
'when using --write-info-json, --write-description etc. (default)'))
@@ -1120,7 +1135,9 @@ def parseOpts(overrideArguments=None):
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': r'\w+(?:\+\w+)?', 'default_key': 'default-compat',
'process': compat_shlex_split, 'multiple_keys': False},
'process': compat_shlex_split,
'multiple_keys': False
},
help=(
'Give these arguments to the postprocessors. '
'Specify the postprocessor/executable name and the arguments separated by a colon ":" '

View File

@@ -59,7 +59,7 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
original_thumbnail = thumbnail_filename = info['thumbnails'][-1]['filepath']
# Convert unsupported thumbnail formats to JPEG (see #25687, #25717)
_, thumbnail_ext = os.path.splitext(thumbnail_filename)
thumbnail_ext = os.path.splitext(thumbnail_filename)[1][1:]
if thumbnail_ext not in ('jpg', 'png'):
thumbnail_filename = convertor.convert_thumbnail(thumbnail_filename, 'jpg')
thumbnail_ext = 'jpg'

View File

@@ -24,7 +24,7 @@ class ExecAfterDownloadPP(PostProcessor):
def parse_cmd(self, cmd, info):
# If no %(key)s is found, replace {} for backard compatibility
if not re.search(FORMAT_RE.format(r'[-\w>.+]+'), cmd):
if not re.search(FORMAT_RE.format(r'[^)]*'), cmd):
if '{}' not in cmd:
cmd += ' {}'
return cmd.replace('{}', compat_shlex_quote(info['filepath']))

View File

@@ -50,7 +50,7 @@ def update_self(to_screen, verbose, opener):
return h.hexdigest()
if not isinstance(globals().get('__loader__'), zipimporter) and not hasattr(sys, 'frozen'):
to_screen('It looks like you installed yt-dlp with a package manager, pip, setup.py or a tarball. Please use that to update.')
to_screen('It looks like you installed yt-dlp with a package manager, pip, setup.py or a tarball. Please use that to update')
return
# sys.executable is set to the full pathname of the exe-file for py2exe
@@ -66,7 +66,7 @@ def update_self(to_screen, verbose, opener):
except Exception:
if verbose:
to_screen(encode_compat_str(traceback.format_exc()))
to_screen('ERROR: can\'t obtain versions info. Please try again later.')
to_screen('ERROR: can\'t obtain versions info. Please try again later')
to_screen('Visit https://github.com/yt-dlp/yt-dlp/releases/latest')
return
@@ -165,7 +165,7 @@ def update_self(to_screen, verbose, opener):
echo.Waiting for file handle to be closed ...
ping 127.0.0.1 -n 5 -w 1000 > NUL
move /Y "%s.new" "%s" > NUL
echo.Updated yt-dlp to version %s.
echo.Updated yt-dlp to version %s
)
@start /b "" cmd /c del "%%~f0"&exit /b
''' % (exe, exe, version_id))
@@ -212,7 +212,7 @@ def update_self(to_screen, verbose, opener):
to_screen('ERROR: unable to overwrite current version')
return
to_screen('Updated yt-dlp. Restart yt-dlp to use the new version.')
to_screen('Updated yt-dlp. Restart yt-dlp to use the new version')
''' # UNUSED

View File

@@ -2166,7 +2166,7 @@ def sanitize_url(url):
for mistake, fixup in COMMON_TYPOS:
if re.match(mistake, url):
return re.sub(mistake, fixup, url)
return url
return escape_url(url)
def sanitized_Request(url, *args, **kwargs):
@@ -2340,15 +2340,20 @@ def make_HTTPS_handler(params, **kwargs):
return YoutubeDLHTTPSHandler(params, context=context, **kwargs)
def bug_reports_message():
def bug_reports_message(before=';'):
if ytdl_is_updateable():
update_cmd = 'type yt-dlp -U to update'
else:
update_cmd = 'see https://github.com/yt-dlp/yt-dlp on how to update'
msg = '; please report this issue on https://github.com/yt-dlp/yt-dlp .'
msg = 'please report this issue on https://github.com/yt-dlp/yt-dlp .'
msg += ' Make sure you are using the latest version; %s.' % update_cmd
msg += ' Be sure to call yt-dlp with the --verbose flag and include its complete output.'
return msg
before = before.rstrip()
if not before or before.endswith(('.', '!', '?')):
msg = msg[0].title() + msg[1:]
return (before + ' ' if before else '') + msg
class YoutubeDLError(Exception):
@@ -2356,6 +2361,12 @@ class YoutubeDLError(Exception):
pass
network_exceptions = [compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error]
if hasattr(ssl, 'CertificateError'):
network_exceptions.append(ssl.CertificateError)
network_exceptions = tuple(network_exceptions)
class ExtractorError(YoutubeDLError):
"""Error during info extraction."""
@@ -2364,7 +2375,7 @@ class ExtractorError(YoutubeDLError):
If expected is set, this is a normal error message and most likely not a bug in yt-dlp.
"""
if sys.exc_info()[0] in (compat_urllib_error.URLError, socket.timeout, UnavailableVideoError):
if sys.exc_info()[0] in network_exceptions:
expected = True
if video_id is not None:
msg = video_id + ': ' + msg
@@ -6070,7 +6081,7 @@ def get_executable_path():
return os.path.abspath(path)
def load_plugins(name, type, namespace):
def load_plugins(name, suffix, namespace):
plugin_info = [None]
classes = []
try:
@@ -6078,7 +6089,9 @@ def load_plugins(name, type, namespace):
name, [os.path.join(get_executable_path(), 'ytdlp_plugins')])
plugins = imp.load_module(name, *plugin_info)
for name in dir(plugins):
if not name.endswith(type):
if name in namespace:
continue
if not name.endswith(suffix):
continue
klass = getattr(plugins, name)
classes.append(klass)
@@ -6101,11 +6114,11 @@ def traverse_dict(dictn, keys, casesense=True):
key = key.lower()
dictn = dictn.get(key)
elif isinstance(dictn, (list, tuple, compat_str)):
key, n = int_or_none(key), len(dictn)
if key is not None and -n <= key < n:
dictn = dictn[key]
if ':' in key:
key = slice(*map(int_or_none, key.split(':')))
else:
dictn = None
key = int_or_none(key)
dictn = try_get(dictn, lambda x: x[key])
else:
return None
return dictn

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2021.04.11'
__version__ = '2021.04.22'

378
yt_dlp/webvtt.py Normal file
View File

@@ -0,0 +1,378 @@
# coding: utf-8
from __future__ import unicode_literals, print_function, division
"""
A partial parser for WebVTT segments. Interprets enough of the WebVTT stream
to be able to assemble a single stand-alone subtitle file, suitably adjusting
timestamps on the way, while everything else is passed through unmodified.
Regular expressions based on the W3C WebVTT specification
<https://www.w3.org/TR/webvtt1/>. The X-TIMESTAMP-MAP extension is described
in RFC 8216 §3.5 <https://tools.ietf.org/html/rfc8216#section-3.5>.
"""
import re
import io
from .utils import int_or_none
from .compat import (
compat_str as str,
compat_Pattern,
compat_Match,
)
class _MatchParser(object):
"""
An object that maintains the current parsing position and allows
conveniently advancing it as syntax elements are successfully parsed.
"""
def __init__(self, string):
self._data = string
self._pos = 0
def match(self, r):
if isinstance(r, compat_Pattern):
return r.match(self._data, self._pos)
if isinstance(r, str):
if self._data.startswith(r, self._pos):
return len(r)
return None
raise ValueError(r)
def advance(self, by):
if by is None:
amt = 0
elif isinstance(by, compat_Match):
amt = len(by.group(0))
elif isinstance(by, str):
amt = len(by)
elif isinstance(by, int):
amt = by
else:
raise ValueError(by)
self._pos += amt
return by
def consume(self, r):
return self.advance(self.match(r))
def child(self):
return _MatchChildParser(self)
class _MatchChildParser(_MatchParser):
"""
A child parser state, which advances through the same data as
its parent, but has an independent position. This is useful when
advancing through syntax elements we might later want to backtrack
from.
"""
def __init__(self, parent):
super(_MatchChildParser, self).__init__(parent._data)
self.__parent = parent
self._pos = parent._pos
def commit(self):
"""
Advance the parent state to the current position of this child state.
"""
self.__parent._pos = self._pos
return self.__parent
class ParseError(Exception):
def __init__(self, parser):
super(ParseError, self).__init__("Parse error at position %u (near %r)" % (
parser._pos, parser._data[parser._pos:parser._pos + 20]
))
_REGEX_TS = re.compile(r'''(?x)
(?:([0-9]{2,}):)?
([0-9]{2}):
([0-9]{2})\.
([0-9]{3})?
''')
_REGEX_EOF = re.compile(r'\Z')
_REGEX_NL = re.compile(r'(?:\r\n|[\r\n])')
_REGEX_BLANK = re.compile(r'(?:\r\n|[\r\n])+')
def _parse_ts(ts):
"""
Convert a parsed WebVTT timestamp (a re.Match obtained from _REGEX_TS)
into an MPEG PES timestamp: a tick counter at 90 kHz resolution.
"""
h, min, s, ms = ts.groups()
return 90 * (
int(h or 0) * 3600000 + # noqa: W504,E221,E222
int(min) * 60000 + # noqa: W504,E221,E222
int(s) * 1000 + # noqa: W504,E221,E222
int(ms) # noqa: W504,E221,E222
)
def _format_ts(ts):
"""
Convert an MPEG PES timestamp into a WebVTT timestamp.
This will lose sub-millisecond precision.
"""
ts = int((ts + 45) // 90)
ms , ts = divmod(ts, 1000) # noqa: W504,E221,E222,E203
s , ts = divmod(ts, 60) # noqa: W504,E221,E222,E203
min, h = divmod(ts, 60) # noqa: W504,E221,E222
return '%02u:%02u:%02u.%03u' % (h, min, s, ms)
class Block(object):
"""
An abstract WebVTT block.
"""
def __init__(self, **kwargs):
for key, val in kwargs.items():
setattr(self, key, val)
@classmethod
def parse(cls, parser):
m = parser.match(cls._REGEX)
if not m:
return None
parser.advance(m)
return cls(raw=m.group(0))
def write_into(self, stream):
stream.write(self.raw)
class HeaderBlock(Block):
"""
A WebVTT block that may only appear in the header part of the file,
i.e. before any cue blocks.
"""
pass
class Magic(HeaderBlock):
_REGEX = re.compile(r'\ufeff?WEBVTT([ \t][^\r\n]*)?(?:\r\n|[\r\n])')
# XXX: The X-TIMESTAMP-MAP extension is described in RFC 8216 §3.5
# <https://tools.ietf.org/html/rfc8216#section-3.5>, but the RFC
# doesnt specify the exact grammar nor where in the WebVTT
# syntax it should be placed; the below has been devised based
# on usage in the wild
#
# And strictly speaking, the presence of this extension violates
# the W3C WebVTT spec. Oh well.
_REGEX_TSMAP = re.compile(r'X-TIMESTAMP-MAP=')
_REGEX_TSMAP_LOCAL = re.compile(r'LOCAL:')
_REGEX_TSMAP_MPEGTS = re.compile(r'MPEGTS:([0-9]+)')
@classmethod
def __parse_tsmap(cls, parser):
parser = parser.child()
while True:
m = parser.consume(cls._REGEX_TSMAP_LOCAL)
if m:
m = parser.consume(_REGEX_TS)
if m is None:
raise ParseError(parser)
local = _parse_ts(m)
if local is None:
raise ParseError(parser)
else:
m = parser.consume(cls._REGEX_TSMAP_MPEGTS)
if m:
mpegts = int_or_none(m.group(1))
if mpegts is None:
raise ParseError(parser)
else:
raise ParseError(parser)
if parser.consume(','):
continue
if parser.consume(_REGEX_NL):
break
raise ParseError(parser)
parser.commit()
return local, mpegts
@classmethod
def parse(cls, parser):
parser = parser.child()
m = parser.consume(cls._REGEX)
if not m:
raise ParseError(parser)
extra = m.group(1)
local, mpegts = None, None
if parser.consume(cls._REGEX_TSMAP):
local, mpegts = cls.__parse_tsmap(parser)
if not parser.consume(_REGEX_NL):
raise ParseError(parser)
parser.commit()
return cls(extra=extra, mpegts=mpegts, local=local)
def write_into(self, stream):
stream.write('WEBVTT')
if self.extra is not None:
stream.write(self.extra)
stream.write('\n')
if self.local or self.mpegts:
stream.write('X-TIMESTAMP-MAP=LOCAL:')
stream.write(_format_ts(self.local if self.local is not None else 0))
stream.write(',MPEGTS:')
stream.write(str(self.mpegts if self.mpegts is not None else 0))
stream.write('\n')
stream.write('\n')
class StyleBlock(HeaderBlock):
_REGEX = re.compile(r'''(?x)
STYLE[\ \t]*(?:\r\n|[\r\n])
((?:(?!-->)[^\r\n])+(?:\r\n|[\r\n]))*
(?:\r\n|[\r\n])
''')
class RegionBlock(HeaderBlock):
_REGEX = re.compile(r'''(?x)
REGION[\ \t]*
((?:(?!-->)[^\r\n])+(?:\r\n|[\r\n]))*
(?:\r\n|[\r\n])
''')
class CommentBlock(Block):
_REGEX = re.compile(r'''(?x)
NOTE(?:\r\n|[\ \t\r\n])
((?:(?!-->)[^\r\n])+(?:\r\n|[\r\n]))*
(?:\r\n|[\r\n])
''')
class CueBlock(Block):
"""
A cue block. The payload is not interpreted.
"""
_REGEX_ID = re.compile(r'((?:(?!-->)[^\r\n])+)(?:\r\n|[\r\n])')
_REGEX_ARROW = re.compile(r'[ \t]+-->[ \t]+')
_REGEX_SETTINGS = re.compile(r'[ \t]+((?:(?!-->)[^\r\n])+)')
_REGEX_PAYLOAD = re.compile(r'[^\r\n]+(?:\r\n|[\r\n])?')
@classmethod
def parse(cls, parser):
parser = parser.child()
id = None
m = parser.consume(cls._REGEX_ID)
if m:
id = m.group(1)
m0 = parser.consume(_REGEX_TS)
if not m0:
return None
if not parser.consume(cls._REGEX_ARROW):
return None
m1 = parser.consume(_REGEX_TS)
if not m1:
return None
m2 = parser.consume(cls._REGEX_SETTINGS)
if not parser.consume(_REGEX_NL):
return None
start = _parse_ts(m0)
end = _parse_ts(m1)
settings = m2.group(1) if m2 is not None else None
text = io.StringIO()
while True:
m = parser.consume(cls._REGEX_PAYLOAD)
if not m:
break
text.write(m.group(0))
parser.commit()
return cls(
id=id,
start=start, end=end, settings=settings,
text=text.getvalue()
)
def write_into(self, stream):
if self.id is not None:
stream.write(self.id)
stream.write('\n')
stream.write(_format_ts(self.start))
stream.write(' --> ')
stream.write(_format_ts(self.end))
if self.settings is not None:
stream.write(' ')
stream.write(self.settings)
stream.write('\n')
stream.write(self.text)
stream.write('\n')
@property
def as_json(self):
return {
'id': self.id,
'start': self.start,
'end': self.end,
'text': self.text,
'settings': self.settings,
}
def parse_fragment(frag_content):
"""
A generator that yields (partially) parsed WebVTT blocks when given
a bytes object containing the raw contents of a WebVTT file.
"""
parser = _MatchParser(frag_content.decode('utf-8'))
yield Magic.parse(parser)
while not parser.match(_REGEX_EOF):
if parser.consume(_REGEX_BLANK):
continue
block = RegionBlock.parse(parser)
if block:
yield block
continue
block = StyleBlock.parse(parser)
if block:
yield block
continue
block = CommentBlock.parse(parser)
if block:
yield block # XXX: or skip
continue
break
while not parser.match(_REGEX_EOF):
if parser.consume(_REGEX_BLANK):
continue
block = CommentBlock.parse(parser)
if block:
yield block # XXX: or skip
continue
block = CueBlock.parse(parser)
if block:
yield block
continue
raise ParseError(parser)