mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-12 01:41:26 +00:00

Compare commits

66 Commits

Author SHA1 Message Date
pukkandan
c25228e5da Release 2021.02.04 2021-02-05 04:50:38 +05:30
pukkandan
de6000d913 Multiple output templates for different file types
Syntax: -o common_template -o type:type_template
Types supported: subtitle|thumbnail|description|annotation|infojson|pl_description|pl_infojson
2021-02-05 04:11:39 +05:30
pukkandan
ff88a05cff [pyinst] Automatically detect python architecture and working directory
:ci skip all
2021-02-04 22:09:10 +05:30
pukkandan
8a784c74d1 [linter] youtube.py 2021-02-04 20:29:25 +05:30
pukkandan
545cc85d11 [youtube] Update to ytdl-2021.02.04.1 2021-02-04 20:07:17 +05:30
pukkandan
c10d0213fc [FormatSort] fix bug where quality had more priority than hasvid 2021-02-04 19:42:14 +05:30
pukkandan
2181983a0c Update to ytdl-2021.02.04.1 except youtube 2021-02-04 13:26:22 +05:30
pukkandan
e29663c644 #45 Allow date/time formatting in output template
Closes #43
2021-02-03 02:45:00 +05:30
pukkandan
9c3fe2ef80 [youtube_live_chat] Fix URL
Bug introduced by 82e3f6ebda

:ci skip dl
2021-02-03 02:22:27 +05:30
pukkandan
b60419c51a [youtube] More metadata extraction for channels/playlists 2021-02-02 21:51:32 +05:30
pukkandan
18590cecdb Strip out internal fields such as _filename from infojson (Closes #42)
:ci skip dl
2021-02-02 03:19:21 +05:30
pukkandan
9f888147de [FormatSort] Allow user to prefer av01 over vp9
The default is still vp9
2021-02-02 03:19:21 +05:30
pukkandan
e8be92f9d6 Fix "Default format spec" appearing in quiet mode 2021-02-02 03:19:21 +05:30
pukkandan
b9d973bef1 Fix issue with overwriting files 2021-02-02 03:19:21 +05:30
pukkandan
c55256c5a3 [audius] Fix extractor 2021-02-01 15:03:59 +05:30
pukkandan
82e3f6ebda [youtube_live_chat] Fix parse_yt_initial_data and add fragment_retries
:ci skip dl
2021-01-31 20:52:43 +05:30
pukkandan
af819c216f [postprocessor] Raise errors correctly
Previously, when a postprocessor reported error, the download was still considered a success. This causes issues especially with critical PPs like Merger, MoveFiles etc

:ci skip dl
2021-01-30 18:07:21 +05:30
pukkandan
e3b771a898 fix typos :ci skip dl 2021-01-30 16:49:58 +05:30
pukkandan
cac96421d9 New option --no-write-playlist-metafiles to NOT write playlist metadata files 2021-01-30 16:43:20 +05:30
pukkandan
7c245ce877 [metadatafromtitle] Fix bug when extracting data from numeric fields
:ci skip dl
2021-01-30 14:36:10 +05:30
pukkandan
eabce90175 [version] update
:ci skip dl
2021-01-29 23:42:28 +05:30
pukkandan
29b6000e35 Release 2021.01.29 2021-01-29 23:25:18 +05:30
pukkandan
e38df8f9fa Refactor update-version, pyinst.py and related files
* Refactor update-version
* Moved pyinst, update-version and icon into devscripts
* pyinst doesn't bump version anymore
* Merge pyinst and pyinst32. Usage: `pyinst.py [32|64]`
* Add mutagen as requirement
* Remove make_win and related files
2021-01-29 23:16:00 +05:30
pukkandan
caa15a7b57 [Audius] Add extractor (Closes #40)
Related: https://github.com/ytdl-org/youtube-dl/pull/27360
Related: https://github.com/ytdl-org/youtube-dl/issues/24216

Direct API URLs are not currently supported. See https://github.com/ytdl-org/youtube-dl/pull/27360#issuecomment-757123708 for details

Co-authored by: qulas
2021-01-29 22:30:22 +05:30
pukkandan
105b0b700e Populate "playlist_*" fields for setting playlist metadata filename
Related: #36
2021-01-29 01:57:14 +05:30
pukkandan
66c935fb16 Linter and misc cleanup
:ci skip dl
2021-01-29 01:03:32 +05:30
pukkandan
64c0d954e5 [youtube] Extract playlist description 2021-01-29 00:31:50 +05:30
pukkandan
bf330f5f29 [anvato] Workaround for anvato_token_generator import failing (Closes #35)
:ci skip dl
2021-01-28 15:57:37 +05:30
pukkandan
f6d7624f57 Partial solution for detecting existing files correctly even when extracting audio
* Does not work when audio format is 'best'
2021-01-28 15:50:03 +05:30
pukkandan
ece8a2a1b6 [embedthumbnail] Fix for missing output filename for ffmpeg call (Closes #38) 2021-01-28 15:48:33 +05:30
Bepis
8d0ea5f955 [Youtube] Improve comment API requests
co-authored by bbepis
2021-01-28 11:49:31 +05:30
pukkandan
0748b3317b Seperate import of lazy_extractors from that of normal extractors
This prevents "ModuleNotFoundError: No module named 'youtube_dl.extractor.lazy_extractors'" from appearing in the traceback

Related: https://github.com/animelover1984/youtube-dl/issues/17#issuecomment-757945024
2021-01-28 11:25:42 +05:30
pukkandan
6b591b2925 Detect existing files correctly even when there is remux/recode
:ci skip dl
2021-01-28 10:49:37 +05:30
pukkandan
179122495b [ffmpeg] Document more formats that are supported for remux/recode 2021-01-28 10:36:34 +05:30
pukkandan
02fd60d305 Write playlist description to file (Closes #36)
:ci skip dl
2021-01-28 06:25:18 +05:30
pukkandan
06167fbbd3 #31 Features from animelover1984/youtube-dl
* Add `--get-comments`
* [youtube] Extract comments
* [billibilli] Added BiliBiliSearchIE, BilibiliChannelIE
* [billibilli] Extract comments
* [billibilli] Better video extraction
* Write playlist data to infojson
* [FFmpegMetadata] Embed infojson inside the video
* [EmbedThumbnail] Try embedding in mp4 using ffprobe and `-disposition`
* [EmbedThumbnail] Treat mka like mkv and mov like mp4
* [EmbedThumbnail] Embed in ogg/opus
* [VideoRemuxer] Conditionally remux video
* [VideoRemuxer] Add `-movflags +faststart` when remuxing from mp4
* [ffmpeg] Print entire stderr in verbose when there is error
* [EmbedSubtitle] Warn when embedding ass in mp4
* [avanto] Use NFLTokenGenerator if possible
2021-01-27 20:32:51 +05:30
pukkandan
4ff5e98991 More badges
:ci skip all
2021-01-27 20:16:34 +05:30
pukkandan
e4172ac903 Deprecate avconv/avprobe
All current functionality is left untouched. But don't expect any new features to work with avconv

:ci skip all
2021-01-26 23:27:32 +05:30
pukkandan
5bfa486205 Add option --parse-metadata
* The fields extracted by this can be used in `--output`
* Deprecated `--metadata-from-title`

:ci skip dl
2021-01-26 16:14:31 +05:30
pukkandan
9882064024 [movefiles] Don't give "cant find" warning when move is unnecessary 2021-01-26 15:53:32 +05:30
pukkandan
2d6921210d [postprocessor] fix write_debug when no _downloader 2021-01-26 15:53:22 +05:30
pukkandan
f137c99e9f Fix some fields not sorting correctly
bug introduced by: 63be1aab2f
2021-01-25 19:28:39 +05:30
pukkandan
6b8eb0c024 Report error message from youtube as error (Closes #33)
:ci skip dl
2021-01-25 10:26:51 +05:30
pukkandan
5b328c97d7 Changed revision number to use '.' instead of '-'
and refactor it

:ci skip dl
2021-01-25 02:25:05 +05:30
pukkandan
b5d265633d Fix wrong user config (Closes #32)
:ci skip dl
2021-01-25 01:52:47 +05:30
pukkandan
a392adf56c [version] update
:ci skip dl
2021-01-24 21:51:50 +05:30
pukkandan
0bc0a32290 Release 2021.01.24 2021-01-24 21:39:55 +05:30
Remita Amine
a820dc722e Update to ytdl-2021.01.24.1 2021-01-24 20:28:44 +05:30
pukkandan
f74980cbae Plugin support
Extractor plugins are loaded from <root-dir>/ytdlp_plugins/extractor/__init__.py

Inspired by https://github.com/un-def/dl-plus

:ci skip dl
2021-01-24 20:24:07 +05:30
pukkandan
c571435f9c [MoveFiles] More robust way to get final filename
:ci skip dl
2021-01-24 20:24:06 +05:30
pukkandan
6b4b65c4f4 [test] fix typo 2021-01-24 14:05:54 +05:30
pukkandan
10e3742eb1 Fix overwrite in --write-link
:ci skip dl
2021-01-24 14:05:32 +05:30
pukkandan
0202b52a0c #29 New option -P/--paths to give different paths for different types of files
Syntax: `-P "type:path" -P "type:path"`
Types: home, temp, description, annotation, subtitle, infojson, thumbnail
2021-01-23 17:53:17 +05:30
pukkandan
b8f6bbe68a Warn when using old style (downloader/postprocessor)_args 2021-01-23 17:41:21 +05:30
pukkandan
256ed01025 [sponskrub] Print "unrecognized args" message correctly 2021-01-23 17:17:47 +05:30
pukkandan
eab9b2bcaf Modified function cli_configuration_args
to directly parse new format of `postprocessor_args` and `external_downloader_args`
2021-01-23 17:00:11 +05:30
pukkandan
3bcaa37b1b [tests] Split core and download tests 2021-01-23 17:00:11 +05:30
pukkandan
46ee996e39 Allow passing different arguments to different external downloaders
* Now similar to --post-processor-args
* Also added `--downloader-args` as alias to `--external-downloader-args`
2021-01-23 17:00:10 +05:30
pukkandan
45016689fa Standardized function for creating dict from repeated options 2021-01-23 17:00:10 +05:30
pukkandan
430c2757ea [cbs] Make failure to extract title non-fatal
:skip ci
2021-01-23 08:51:57 +05:30
The Hatsune Daishi
ffcb819171 #30 [mildom] Add extractor
Authored by @nao20010128nao
2021-01-22 19:13:30 +05:30
pukkandan
b46696bdc8 Revert d9eebbc747 2021-01-22 01:09:24 +05:30
pukkandan
63be1aab2f Deprecate unnecessary aliases in formatSort
(I should never have made so many aliases in the first-place)
The aliases remain functional for backward compatability, but will be left undocumented
2021-01-21 19:05:57 +05:30
pukkandan
d0757229fa Fix typecasting when pre-checking archive (Closes #26) 2021-01-21 17:36:42 +05:30
pukkandan
610d8e7692 [tests] Fix test_post_hooks
:skip ci all
2021-01-21 03:38:57 +05:30
pukkandan
e2f6586c16 [version] update
:skip ci all
2021-01-21 03:01:26 +05:30
100 changed files with 4817 additions and 3626 deletions
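
The repeated `TYPE:VALUE` options introduced in these commits (`-o type:type_template`, `-P "type:path"`) can be pictured as a list of strings folded into a dictionary. The following is a minimal sketch of that idea, not the project's code: `parse_typed_options`, the `default` key, and the sample templates are hypothetical names chosen for illustration. Only the type names come from the commit messages above.

```python
# Hypothetical helper: fold repeated "TYPE:VALUE" options into a dict.
# Type names are taken from the commit message for de6000d913.
OUTPUT_TYPES = {
    "subtitle", "thumbnail", "description", "annotation",
    "infojson", "pl_description", "pl_infojson",
}


def parse_typed_options(values, allowed_types, default_key="default"):
    """Map each option to its type; untyped values go under default_key."""
    result = {}
    for value in values:
        prefix, sep, rest = value.partition(":")
        if sep and prefix in allowed_types:
            result[prefix] = rest
        else:
            # No recognized "TYPE:" prefix, so treat it as the common value.
            result[default_key] = value
    return result


# Equivalent in spirit to: -o "%(title)s.%(ext)s" -o "thumbnail:thumbs/%(id)s.%(ext)s"
templates = parse_typed_options(
    ["%(title)s.%(ext)s", "thumbnail:thumbs/%(id)s.%(ext)s"],
    OUTPUT_TYPES,
)
```

In this sketch a value whose prefix is not a known type (for example a Windows path such as `C:\videos`) falls through to the default key rather than being split at the colon.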

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.16. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.29. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/pukkandan/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running yt-dlp version **2021.01.16**
- [ ] I've verified that I'm running yt-dlp version **2021.01.29**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
@@ -44,7 +44,7 @@ Add the `-v` flag to your command line you run youtube-dlc with (`youtube-dlc -v
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.01.16
[debug] yt-dlp version 2021.01.29
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.16. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.29. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/pukkandan/yt-dlp. yt-dlp does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running yt-dlp version **2021.01.16**
- [ ] I've verified that I'm running yt-dlp version **2021.01.29**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.16. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.29. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running yt-dlp version **2021.01.16**
- [ ] I've verified that I'm running yt-dlp version **2021.01.29**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones

@@ -21,7 +21,7 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.16. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.29. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/pukkandan/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
@@ -30,7 +30,7 @@ Carefully read and work through this check list in order to prevent the most com
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running yt-dlp version **2021.01.16**
- [ ] I've verified that I'm running yt-dlp version **2021.01.29**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -46,7 +46,7 @@ Add the `-v` flag to your command line you run youtube-dlc with (`youtube-dlc -v
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.01.16
[debug] yt-dlp version 2021.01.29
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

@@ -21,13 +21,13 @@ assignees: ''
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.16. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- First of, make sure you are using the latest version of yt-dlp. Run `youtube-dlc --version` and ensure your version is 2021.01.29. If it's not, see https://github.com/pukkandan/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: https://github.com/pukkandan/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running yt-dlp version **2021.01.16**
- [ ] I've verified that I'm running yt-dlp version **2021.01.29**
- [ ] I've searched the bugtracker for similar feature requests including closed ones

@@ -25,8 +25,8 @@ jobs:
run: sudo apt-get -y install zip pandoc man
- name: Bump version
id: bump_version
run: python scripts/update-version-workflow.py
- name: Check the output from My action
run: python devscripts/update-version.py
- name: Print version
run: echo "${{ steps.bump_version.outputs.ytdlc_version }}"
- name: Run Make
run: make
@@ -84,11 +84,14 @@ jobs:
with:
python-version: '3.8'
- name: Install Requirements
run: pip install pyinstaller
run: pip install pyinstaller mutagen
- name: Bump version
run: python scripts/update-version-workflow.py
id: bump_version
run: python devscripts/update-version.py
- name: Print version
run: echo "${{ steps.bump_version.outputs.ytdlc_version }}"
- name: Run PyInstaller Script
run: python pyinst.py
run: python devscripts/pyinst.py 64
- name: Upload youtube-dlc.exe Windows binary
id: upload-release-windows
uses: actions/upload-release-asset@v1
@@ -119,11 +122,14 @@ jobs:
python-version: '3.4.4'
architecture: 'x86'
- name: Install Requirements for 32 Bit
run: pip install pyinstaller==3.5
run: pip install pyinstaller==3.5 mutagen
- name: Bump version
run: python scripts/update-version-workflow.py
id: bump_version
run: python devscripts/update-version.py
- name: Print version
run: echo "${{ steps.bump_version.outputs.ytdlc_version }}"
- name: Run PyInstaller Script for 32 Bit
run: python pyinst32.py
run: python devscripts/pyinst.py 32
- name: Upload Executable youtube-dlc_x86.exe
id: upload-release-windows32
uses: actions/upload-release-asset@v1
@@ -162,18 +168,15 @@ jobs:
asset_name: SHA2-256SUMS
asset_content_type: text/plain
update_version_badge:
runs-on: ubuntu-latest
needs: build_unix
steps:
- name: Create Version Badge
uses: schneegans/dynamic-badges-action@v1.0.0
with:
auth: ${{ secrets.GIST_TOKEN }}
gistID: c69cb23c3c5b3316248e52022790aa57
filename: version.json
label: Version
message: ${{ needs.build_unix.outputs.ytdlc_version }}
# update_version_badge:
# runs-on: ubuntu-latest
# needs: build_unix
# steps:
# - name: Create Version Badge
# uses: schneegans/dynamic-badges-action@v1.0.0
# with:
# auth: ${{ secrets.GIST_TOKEN }}
# gistID: c69cb23c3c5b3316248e52022790aa57
# filename: version.json
# label: Version
# message: ${{ needs.build_unix.outputs.ytdlc_version }}

@@ -1,9 +1,9 @@
name: Full Test
name: Core Tests
on: [push, pull_request]
jobs:
tests:
name: Tests
if: "!contains(github.event.head_commit.message, 'skip ci')"
name: Core Tests
if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true
@@ -12,7 +12,7 @@ jobs:
# TODO: python 2.6
python-version: [2.7, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, pypy-2.7, pypy-3.6, pypy-3.7]
python-impl: [cpython]
ytdl-test-set: [core, download]
ytdl-test-set: [core]
run-tests-ext: [sh]
include:
# python 3.2 is only available on windows via setup-python
@@ -21,20 +21,11 @@ jobs:
python-impl: cpython
ytdl-test-set: core
run-tests-ext: bat
- os: windows-latest
python-version: 3.2
python-impl: cpython
ytdl-test-set: download
run-tests-ext: bat
# jython
- os: ubuntu-latest
python-impl: jython
ytdl-test-set: core
run-tests-ext: sh
- os: ubuntu-latest
python-impl: jython
ytdl-test-set: download
run-tests-ext: sh
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
@@ -60,4 +51,4 @@ jobs:
env:
YTDL_TEST_SET: ${{ matrix.ytdl-test-set }}
run: ./devscripts/run_tests.${{ matrix.run-tests-ext }}
# flake8 has been moved to quick-test
# Linter is in quick-test

.github/workflows/download.yml (new file, 53 lines)

@@ -0,0 +1,53 @@
name: Download Tests
on: [push, pull_request]
jobs:
tests:
name: Download Tests
if: "!contains(github.event.head_commit.message, 'ci skip dl') && !contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ${{ matrix.os }}
strategy:
fail-fast: true
matrix:
os: [ubuntu-18.04]
# TODO: python 2.6
python-version: [2.7, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, pypy-2.7, pypy-3.6, pypy-3.7]
python-impl: [cpython]
ytdl-test-set: [download]
run-tests-ext: [sh]
include:
# python 3.2 is only available on windows via setup-python
- os: windows-latest
python-version: 3.2
python-impl: cpython
ytdl-test-set: download
run-tests-ext: bat
# jython - disable for now since it takes too long to complete
# - os: ubuntu-latest
# python-impl: jython
# ytdl-test-set: download
# run-tests-ext: sh
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
if: ${{ matrix.python-impl == 'cpython' }}
with:
python-version: ${{ matrix.python-version }}
- name: Set up Java 8
if: ${{ matrix.python-impl == 'jython' }}
uses: actions/setup-java@v1
with:
java-version: 8
- name: Install Jython
if: ${{ matrix.python-impl == 'jython' }}
run: |
wget http://search.maven.org/remotecontent?filepath=org/python/jython-installer/2.7.1/jython-installer-2.7.1.jar -O jython-installer.jar
java -jar jython-installer.jar -s -d "$HOME/jython"
echo "$HOME/jython/bin" >> $GITHUB_PATH
- name: Install nose
run: pip install nose
- name: Run tests
continue-on-error: ${{ matrix.ytdl-test-set == 'download' || matrix.python-impl == 'jython' }}
env:
YTDL_TEST_SET: ${{ matrix.ytdl-test-set }}
run: ./devscripts/run_tests.${{ matrix.run-tests-ext }}

@@ -1,13 +1,13 @@
name: Core Test
name: Quick Test
on: [push, pull_request]
jobs:
tests:
name: Core Tests
if: "!contains(github.event.head_commit.message, 'skip ci all')"
name: Core Test
if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.9
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: 3.9
@@ -19,7 +19,7 @@ jobs:
run: ./devscripts/run_tests.sh
flake8:
name: Linter
if: "!contains(github.event.head_commit.message, 'skip ci all')"
if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
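
The `if:` conditions in the three workflows above can be summarized as one small decision function. This is a sketch of the logic only, assuming the head commit message is available as a plain string; `workflows_to_run` and the dictionary keys are hypothetical names. GitHub documents `contains()` as case-insensitive, so the sketch lowercases the message before matching.

```python
def workflows_to_run(commit_message):
    """Return which test workflows would run for a given commit message."""
    message = commit_message.lower()
    skip_all = "ci skip all" in message
    skip_dl = "ci skip dl" in message
    return {
        "core": not skip_all,                   # Core Tests
        "quick": not skip_all,                  # Quick Test and Linter
        "download": not (skip_dl or skip_all),  # Download Tests
    }


# A commit that only touches packaging can skip the slow download tests.
decision = workflows_to_run("[version] update\n:ci skip dl")
```

Under the new conditions the older marker order (`:skip ci all`) no longer matches any of the substrings, so a commit tagged that way would run every workflow.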

.gitignore (92 lines changed)

@@ -1,35 +1,43 @@
# Python
*.pyc
*.pyo
*.class
*~
*.DS_Store
wine-py2exe/
py2exe.log
*.kate-swp
build/
dist/
zip/
tmp/
venv/
# Misc
*~
*.DS_Store
*.kate-swp
MANIFEST
README.txt
youtube-dl.1
youtube-dlc.1
youtube-dl.bash-completion
youtube-dlc.bash-completion
youtube-dl.fish
youtube-dlc.fish
youtube_dl/extractor/lazy_extractors.py
youtube_dlc/extractor/lazy_extractors.py
youtube-dl
youtube-dlc
youtube-dl.exe
youtube-dlc.exe
youtube-dl.tar.gz
youtube-dlc.tar.gz
youtube-dlc.spec
test/local_parameters.json
.coverage
cover/
updates_key.pem
*.egg-info
.tox
*.class
# Generated
README.txt
*.1
*.bash-completion
*.fish
*.exe
*.tar.gz
*.zsh
*.spec
# Binary
youtube-dl
youtube-dlc
*.exe
# Downloaded
*.srt
*.ttml
*.sbv
@@ -46,25 +54,33 @@ updates_key.pem
*.swf
*.part
*.ytdl
*.conf
*.swp
*.ogg
*.opus
*.info.json
*.annotations.xml
*.description
# Config
*.conf
*.spec
*.exe
test/local_parameters.json
.tox
youtube-dl.zsh
youtube-dlc.zsh
# IntelliJ related files
.idea
*.iml
tmp/
venv/
# VS Code related files
.vscode
cookies
cookies.txt
*.sublime-workspace
# Text Editor / IDE
.idea
*.iml
.vscode
*.sublime-workspace
*.sublime-project
!yt-dlp.sublime-project
# Lazy extractors
*/extractor/lazy_extractors.py
# Plugins
ytdlp_plugins/extractor/*
!ytdlp_plugins/extractor/__init__.py
!ytdlp_plugins/extractor/sample.py
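
The plugin rules at the end of this `.gitignore` rely on negation: the wildcard ignores everything in the plugin directory, and the `!` lines re-include two files. A rough sketch of that "last matching pattern wins" behavior follows. It is a simplification, not Git's matcher: `is_ignored` is a hypothetical helper, `fnmatchcase` lets `*` match `/` whereas Git does not, and `my_site.py` is an invented file name.

```python
from fnmatch import fnmatchcase

# The three plugin-related patterns from the diff above, in file order.
PLUGIN_PATTERNS = [
    "ytdlp_plugins/extractor/*",
    "!ytdlp_plugins/extractor/__init__.py",
    "!ytdlp_plugins/extractor/sample.py",
]


def is_ignored(path, patterns=PLUGIN_PATTERNS):
    """Apply patterns in order; the last one that matches decides."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(path, glob):
            ignored = not negated
    return ignored


# A user-written plugin stays untracked; the shipped sample stays tracked.
user_plugin_ignored = is_ignored("ytdlp_plugins/extractor/my_site.py")
sample_ignored = is_ignored("ytdlp_plugins/extractor/sample.py")
```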

AUTHORS (deleted, 248 lines)

@@ -1,248 +0,0 @@
Ricardo Garcia Gonzalez
Danny Colligan
Benjamin Johnson
Vasyl' Vavrychuk
Witold Baryluk
Paweł Paprota
Gergely Imreh
Rogério Brito
Philipp Hagemeister
Sören Schulze
Kevin Ngo
Ori Avtalion
shizeeg
Filippo Valsorda
Christian Albrecht
Dave Vasilevsky
Jaime Marquínez Ferrándiz
Jeff Crouse
Osama Khalid
Michael Walter
M. Yasoob Ullah Khalid
Julien Fraichard
Johny Mo Swag
Axel Noack
Albert Kim
Pierre Rudloff
Huarong Huo
Ismael Mejía
Steffan Donal
Andras Elso
Jelle van der Waa
Marcin Cieślak
Anton Larionov
Takuya Tsuchida
Sergey M.
Michael Orlitzky
Chris Gahan
Saimadhav Heblikar
Mike Col
Oleg Prutz
pulpe
Andreas Schmitz
Michael Kaiser
Niklas Laxström
David Triendl
Anthony Weems
David Wagner
Juan C. Olivares
Mattias Harrysson
phaer
Sainyam Kapoor
Nicolas Évrard
Jason Normore
Hoje Lee
Adam Thalhammer
Georg Jähnig
Ralf Haring
Koki Takahashi
Ariset Llerena
Adam Malcontenti-Wilson
Tobias Bell
Naglis Jonaitis
Charles Chen
Hassaan Ali
Dobrosław Żybort
David Fabijan
Sebastian Haas
Alexander Kirk
Erik Johnson
Keith Beckman
Ole Ernst
Aaron McDaniel (mcd1992)
Magnus Kolstad
Hari Padmanaban
Carlos Ramos
5moufl
lenaten
Dennis Scheiba
Damon Timm
winwon
Xavier Beynon
Gabriel Schubiner
xantares
Jan Matějka
Mauroy Sébastien
William Sewell
Dao Hoang Son
Oskar Jauch
Matthew Rayfield
t0mm0
Tithen-Firion
Zack Fernandes
cryptonaut
Adrian Kretz
Mathias Rav
Petr Kutalek
Will Glynn
Max Reimann
Cédric Luthi
Thijs Vermeir
Joel Leclerc
Christopher Krooss
Ondřej Caletka
Dinesh S
Johan K. Jensen
Yen Chi Hsuan
Enam Mijbah Noor
David Luhmer
Shaya Goldberg
Paul Hartmann
Frans de Jonge
Robin de Rooij
Ryan Schmidt
Leslie P. Polzer
Duncan Keall
Alexander Mamay
Devin J. Pohly
Eduardo Ferro Aldama
Jeff Buchbinder
Amish Bhadeshia
Joram Schrijver
Will W.
Mohammad Teimori Pabandi
Roman Le Négrate
Matthias Küch
Julian Richen
Ping O.
Mister Hat
Peter Ding
jackyzy823
George Brighton
Remita Amine
Aurélio A. Heckert
Bernhard Minks
sceext
Zach Bruggeman
Tjark Saul
slangangular
Behrouz Abbasi
ngld
nyuszika7h
Shaun Walbridge
Lee Jenkins
Anssi Hannula
Lukáš Lalinský
Qijiang Fan
Rémy Léone
Marco Ferragina
reiv
Muratcan Simsek
Evan Lu
flatgreen
Brian Foley
Vignesh Venkat
Tom Gijselinck
Founder Fang
Andrew Alexeyew
Saso Bezlaj
Erwin de Haan
Jens Wille
Robin Houtevelts
Patrick Griffis
Aidan Rowe
mutantmonkey
Ben Congdon
Kacper Michajłow
José Joaquín Atria
Viťas Strádal
Kagami Hiiragi
Philip Huppert
blahgeek
Kevin Deldycke
inondle
Tomáš Čech
Déstin Reed
Roman Tsiupa
Artur Krysiak
Jakub Adam Wieczorek
Aleksandar Topuzović
Nehal Patel
Rob van Bekkum
Petr Zvoníček
Pratyush Singh
Aleksander Nitecki
Sebastian Blunt
Matěj Cepl
Xie Yanbo
Philip Xu
John Hawkinson
Rich Leeper
Zhong Jianxin
Thor77
Mattias Wadman
Arjan Verwer
Costy Petrisor
Logan B
Alex Seiler
Vijay Singh
Paul Hartmann
Stephen Chen
Fabian Stahl
Bagira
Odd Stråbø
Philip Herzog
Thomas Christlieb
Marek Rusinowski
Tobias Gruetzmacher
Olivier Bilodeau
Lars Vierbergen
Juanjo Benages
Xiao Di Guan
Thomas Winant
Daniel Twardowski
Jeremie Jarosh
Gerard Rovira
Marvin Ewald
Frédéric Bournival
Timendum
gritstub
Adam Voss
Mike Fährmann
Jan Kundrát
Giuseppe Fabiano
Örn Guðjónsson
Parmjit Virk
Genki Sky
Ľuboš Katrinec
Corey Nicholson
Ashutosh Chaudhary
John Dong
Tatsuyuki Ishi
Daniel Weber
Kay Bouché
Yang Hongbo
Lei Wang
Petr Novák
Leonardo Taccari
Martin Weinelt
Surya Oktafendri
TingPing
Alexandre Macabies
Bastian de Groot
Niklas Haas
András Veres-Szentkirályi
Enes Solak
Nathan Rossi
Thomas van der Berg
Luca Cherubin

@@ -15,4 +15,5 @@ ohnonot
samiksome
alxnull
FelixFrog
Zocker1999NET
Zocker1999NET
nao20010128nao

@@ -4,25 +4,101 @@
# Instructions for creating release
* Run `make doc`
* Update Changelog.md and Authors-Fork
* Update Changelog.md and CONTRIBUTORS
* Change "Merged with youtube-dl" version in Readme.md if needed
* Commit to master as `Release <version>`
* Push to origin/release - build task will now run
* Update version.py and run `make issuetemplates`
* Commit to master as `[version] update`
* Update version.py using devscripts\update-version.py (be wary of timezones)
* Run `make issuetemplates`
* Commit to master as `[version] update :ci skip all`
* Push to origin/master
* Update changelog in /releases
-->
### 2021.02.04
* **Merge youtube-dl:** Up to [2021.02.04.1](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.02.04.1)
* **Date/time formatting in output template:** You can now use [`strftime`](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) to format date/time fields. Example: `%(upload_date>%Y-%m-%d)s`
* **Multiple output templates:**
    * Separate output templates can be given for the different metadata files by using `-o TYPE:TEMPLATE`
    * The allowed types are: `subtitle|thumbnail|description|annotation|infojson|pl_description|pl_infojson`
* [youtube] More metadata extraction for channel/playlist URLs (channel, uploader, thumbnail, tags)
* New option `--no-write-playlist-metafiles` to prevent writing playlist metadata files
* [audius] Fix extractor
* [youtube_live_chat] Fix `parse_yt_initial_data` and add `fragment_retries`
* [postprocessor] Raise errors correctly
* [metadatafromtitle] Fix bug when extracting data from numeric fields
* Fix issue with overwriting files
* Fix "Default format spec" appearing in quiet mode
* [FormatSort] Allow user to prefer av01 over vp9 (The default is still vp9)
* [FormatSort] fix bug where `quality` had more priority than `hasvid`
* [pyinst] Automatically detect python architecture and working directory
* Strip out internal fields such as `_filename` from infojson
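
The `%(upload_date>%Y-%m-%d)s` syntax from this release can be read as "take the field, then format it with `strftime`". Below is a minimal sketch of that reading, not the project's implementation: `expand_date_fields` and `FIELD_RE` are hypothetical names, and the sketch assumes the field holds a `YYYYMMDD` string.

```python
import re
from datetime import datetime

# Matches "%(name>strftime-format)s"; plain "%(name)s" fields are left alone.
FIELD_RE = re.compile(r"%\((?P<name>\w+)>(?P<fmt>[^)]+)\)s")


def expand_date_fields(template, info):
    """Replace each %(field>format)s with the reformatted date from info."""
    def replace(match):
        raw = info[match.group("name")]            # assumed to be "YYYYMMDD"
        parsed = datetime.strptime(raw, "%Y%m%d")
        return parsed.strftime(match.group("fmt"))
    return FIELD_RE.sub(replace, template)


formatted = expand_date_fields(
    "%(upload_date>%Y-%m-%d)s",
    {"upload_date": "20210204"},
)
```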
### 2021.01.29
* **Features from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl)**: Co-authored by @animelover1984 and @bbepis
    * Add `--get-comments`
    * [youtube] Extract comments
    * [bilibili] Added BiliBiliSearchIE, BilibiliChannelIE
    * [bilibili] Extract comments
    * [bilibili] Better video extraction
    * Write playlist data to infojson
    * [FFmpegMetadata] Embed infojson inside the video
    * [EmbedThumbnail] Try embedding in mp4 using ffprobe and `-disposition`
    * [EmbedThumbnail] Treat mka like mkv and mov like mp4
    * [EmbedThumbnail] Embed in ogg/opus
    * [VideoRemuxer] Conditionally remux video
    * [VideoRemuxer] Add `-movflags +faststart` when remuxing to mp4
    * [ffmpeg] Print entire stderr in verbose when there is error
    * [EmbedSubtitle] Warn when embedding ass in mp4
    * [anvato] Use NFLTokenGenerator if possible
* **Parse additional metadata**: New option `--parse-metadata` to extract additional metadata from existing fields
    * The extracted fields can be used in `--output`
    * Deprecated `--metadata-from-title`
* [Audius] Add extractor
* [youtube] Extract playlist description and write it to `.description` file
* Detect existing files even when using `recode`/`remux` (`extract-audio` is partially fixed)
* Fix wrong user config from v2021.01.24
* [youtube] Report error message from youtube as error instead of warning
* [FormatSort] Fix some fields not sorting from v2021.01.24
* [postprocessor] Deprecate `avconv`/`avprobe`. All current functionality is left untouched. But don't expect any new features to work with avconv
* [postprocessor] fix `write_debug` to not throw error when there is no `_downloader`
* [movefiles] Don't give "cant find" warning when move is unnecessary
* Refactor `update-version`, `pyinst.py` and related files
* [ffmpeg] Document more formats that are supported for remux/recode
### 2021.01.24
* **Merge youtube-dl:** Upto [2021.01.24](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.16)
* Plugin support ([documentation](https://github.com/pukkandan/yt-dlp#plugins))
* **Multiple paths**: New option `-P`/`--paths` to give different paths for different types of files
* The syntax is `-P "type:path" -P "type:path"` ([documentation](https://github.com/pukkandan/yt-dlp#:~:text=-P,%20--paths%20TYPE:PATH))
* Valid types are: home, temp, description, annotation, subtitle, infojson, thumbnail
* Additionally, configuration file is taken from home directory or current directory ([documentation](https://github.com/pukkandan/yt-dlp#:~:text=Home%20Configuration))
* Allow passing different arguments to different external downloaders ([documentation](https://github.com/pukkandan/yt-dlp#:~:text=--downloader-args%20NAME:ARGS))
* [mildom] Add extractor by @nao20010128nao
* Warn when using old style `--external-downloader-args` and `--post-processor-args`
* Fix `--no-overwrite` when using `--write-link`
* [sponskrub] Output `unrecognized argument` error message correctly
* [cbs] Make failure to extract title non-fatal
* Fix typecasting when pre-checking archive
* Fix issue with setting title on UNIX
* Deprecate redundant aliases in `formatSort`. The aliases remain functional for backward compatibility, but will be left undocumented
* [tests] Fix test_post_hooks
* [tests] Split core and download tests
### 2021.01.20
* [TrovoLive] Add extractor (only VODs)
* [pokemon] Add `/#/player` URLs (Closes #24)
* [pokemon] Add `/#/player` URLs
* Improved parsing of multiple postprocessor-args, add `--ppa` as alias
* [EmbedThumbnail] Simplify embedding in mkv
* [sponskrub] Encode filenames correctly, better debug output and error message
* [readme] Cleanup options
### 2021.01.16
* **Merge youtube-dl:** Upto [2021.01.16](https://github.com/ytdl-org/youtube-dl/releases/tag/2021.01.16)
* **Configuration files:**
@@ -118,7 +194,7 @@
* Added `--no-ignore-dynamic-mpd`, `--no-allow-dynamic-mpd`, `--allow-dynamic-mpd`, `--youtube-include-hls-manifest`, `--no-youtube-include-hls-manifest`, `--no-youtube-skip-hls-manifest`, `--no-download`, `--no-download-archive`, `--resize-buffer`, `--part`, `--mtime`, `--no-keep-fragments`, `--no-cookies`, `--no-write-annotations`, `--no-write-info-json`, `--no-write-description`, `--no-write-thumbnail`, `--youtube-include-dash-manifest`, `--post-overwrites`, `--no-keep-video`, `--no-embed-subs`, `--no-embed-thumbnail`, `--no-add-metadata`, `--no-include-ads`, `--no-write-sub`, `--no-write-auto-sub`, `--no-playlist-reverse`, `--no-restrict-filenames`, `--youtube-include-dash-manifest`, `--no-format-sort-force`, `--flat-videos`, `--no-list-formats-as-table`, `--no-sponskrub`, `--no-sponskrub-cut`, `--no-sponskrub-force`
* Renamed: `--write-subs`, `--no-write-subs`, `--no-write-auto-subs`, `--write-auto-subs`. Note that these can still be used without the ending "s"
* Relaxed validation for format filters so that any arbitrary field can be used
* Fix for embedding thumbnail in mp3 by @pauldubois98
* Fix for embedding thumbnail in mp3 by @pauldubois98 ([ytdl-org/youtube-dl#21569](https://github.com/ytdl-org/youtube-dl/pull/21569))
* Make Twitch Video ID output from Playlist and VOD extractor same. This is only a temporary fix
* **Merge youtube-dl:** Upto [2021.01.03](https://github.com/ytdl-org/youtube-dl/commit/8e953dcbb10a1a42f4e12e4e132657cb0100a1f8) - See [blackjack4494/yt-dlc#280](https://github.com/blackjack4494/yt-dlc/pull/280) for details
* Extractors [tiktok](https://github.com/ytdl-org/youtube-dl/commit/fb626c05867deab04425bad0c0b16b55473841a2) and [hotstar](https://github.com/ytdl-org/youtube-dl/commit/bb38a1215718cdf36d73ff0a7830a64cd9fa37cc) have not been merged

README.md

@@ -1,10 +1,12 @@
# YT-DLP
<!-- See: https://github.com/marketplace/actions/dynamic-badges -->
[![Release Version](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/pukkandan/c69cb23c3c5b3316248e52022790aa57/raw/version.json&color=brightgreen)](https://github.com/pukkandan/yt-dlp/releases/latest)
[![License: Unlicense](https://img.shields.io/badge/License-Unlicense-blue.svg)](https://github.com/pukkandan/yt-dlp/blob/master/LICENSE)
[![Core Status](https://github.com/pukkandan/yt-dlp/workflows/Core%20Test/badge.svg?branch=master)](https://github.com/pukkandan/yt-dlp/actions?query=workflow%3ACore)
[![CI Status](https://github.com/pukkandan/yt-dlp/workflows/Full%20Test/badge.svg?branch=master)](https://github.com/pukkandan/yt-dlp/actions?query=workflow%3AFull)
[![Release version](https://img.shields.io/github/v/release/pukkandan/yt-dlp?color=brightgreen&label=Release)](https://github.com/pukkandan/yt-dlp/releases/latest)
[![License: Unlicense](https://img.shields.io/badge/License-Unlicense-blue.svg)](LICENSE)
[![CI Status](https://github.com/pukkandan/yt-dlp/workflows/Core%20Tests/badge.svg?branch=master)](https://github.com/pukkandan/yt-dlp/actions)
[![Commits](https://img.shields.io/github/commit-activity/m/pukkandan/yt-dlp?label=commits)](https://github.com/pukkandan/yt-dlp/commits)
[![Last Commit](https://img.shields.io/github/last-commit/pukkandan/yt-dlp/master)](https://github.com/pukkandan/yt-dlp/commits)
[![Downloads](https://img.shields.io/github/downloads/pukkandan/yt-dlp/total)](https://github.com/pukkandan/yt-dlp/releases/latest)
[![PyPi Downloads](https://img.shields.io/pypi/dm/yt-dlp?label=PyPi)](https://pypi.org/project/yt-dlp)
A command-line program to download videos from youtube.com and many other [video platforms](docs/supportedsites.md)
@@ -41,6 +43,7 @@ This is a fork of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) which i
* [Filtering Formats](#filtering-formats)
* [Sorting Formats](#sorting-formats)
* [Format Selection examples](#format-selection-examples)
* [PLUGINS](#plugins)
* [MORE](#more)
@@ -51,20 +54,31 @@ The major new features from the latest release of [blackjack4494/yt-dlc](https:/
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs are now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with youtube-dl v2021.01.16**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with youtube-dl v2021.02.04.1**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, Playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/pukkandan/yt-dlp/pull/31) for details.
* **Youtube improvements**:
* All Youtube Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`) work correctly and support downloading multiple pages of content
* Youtube search works correctly (`ytsearch:`, `ytsearchdate:`) along with Search URLs
* Redirect channel's home URL automatically to `/video` to preserve the old behaviour
* **New extractors**: Trovo.live, AnimeLab, Philo MSO, Rcs, Gedi, bitwave.tv
* **New extractors**: AnimeLab, Philo MSO, Rcs, Gedi, bitwave.tv, mildom, audius
* **Fixed extractors**: archive.org, roosterteeth.com, skyit, instagram, itv, SouthparkDe, spreaker, Vlive, tiktok, akamai, ina
* **New options**: `--list-formats-as-table`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Plugin support**: Extractors can be loaded from an external file. See [plugins](#plugins) for details
* **Multiple paths and output templates**: You can give different [output templates](#output-template) and download paths for different types of files. You can also set a temporary path where intermediary files are downloaded to. See [`--paths`](https://github.com/pukkandan/yt-dlp/#:~:text=-P,%20--paths%20TYPE:PATH) for details
<!-- Relative link doesn't work for "#:~:text=" -->
* **Portable Configuration**: Configuration files are automatically loaded from the home and root directories. See [configuration](#configuration) for details
* **Other new options**: `--parse-metadata`, `--list-formats-as-table`, `--write-link`, `--force-download-archive`, `--force-overwrites`, `--break-on-reject` etc
* **Improvements**: Multiple `--postprocessor-args` and `--external-downloader-args`, Date/time formatting in `-o`, faster archive checking, more [format selection options](#format-selection) etc
* **Improvements**: Multiple `--postprocessor-args`, `%(duration_string)s` in `-o`, faster archive checking, more [format selection options](#format-selection) etc
See [changelog](Changelog.md) or [commits](https://github.com/pukkandan/yt-dlp/commits) for the full list of changes
@@ -77,7 +91,7 @@ If you are coming from [youtube-dl](https://github.com/ytdl-org/youtube-dl), the
# INSTALLATION
You can install yt-dlp using one of the following methods:
* Use [PyPI package](https://pypi.org/project/yt-dlp/): `python -m pip install --upgrade yt-dlp`
* Use [PyPI package](https://pypi.org/project/yt-dlp): `python -m pip install --upgrade yt-dlp`
* Download the binary from the [latest release](https://github.com/pukkandan/yt-dlp/releases/latest)
* Use pip+git: `python -m pip install --upgrade git+https://github.com/pukkandan/yt-dlp.git@release`
* Install master branch: `python -m pip install --upgrade git+https://github.com/pukkandan/yt-dlp`
@@ -88,16 +102,15 @@ You can install yt-dlp using one of the following methods:
### COMPILE
**For Windows**:
To build the Windows executable yourself (without version info!)
To build the Windows executable, you must have pyinstaller (and optionally mutagen for embedding thumbnail in opus/ogg files)
python -m pip install --upgrade pyinstaller mutagen
Once you have all the necessary dependencies installed, just run `py devscripts\pyinst.py`. The executable will be built for the same architecture (32/64 bit) as the python used to build it. It is strongly recommended to use python3 although python2.6+ is supported.
You can also build the executable without any version info or metadata by using:
python -m pip install --upgrade pyinstaller
pyinstaller.exe youtube_dlc\__main__.py --onefile --name youtube-dlc
Or simply execute the `make_win.bat` if pyinstaller is installed.
There will be a `youtube-dlc.exe` in `/dist`
New way to build Windows is to use `python pyinst.py` (please use python3 64Bit)
For 32Bit Version use a 32Bit Version of python (3 preferred here as well) and run `python pyinst32.py`
**For Unix**:
You will need the required build tools
@@ -106,6 +119,7 @@ Then simply type this
make
**Note**: In either platform, `devscripts\update-version.py` can be used to automatically update the version number
# DESCRIPTION
**youtube-dlc** is a command-line program to download videos from youtube.com and many other [video platforms](docs/supportedsites.md). It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
@@ -151,9 +165,9 @@ Then simply type this
compatibility) if this option is found
inside the system configuration file, the
user configuration is not loaded
--config-location PATH Location of the configuration file; either
the path to the config or its containing
directory
--config-location PATH Location of the main configuration file;
either the path to the config or its
containing directory
--flat-playlist Do not extract the videos of a playlist,
only list them
--flat-videos Do not resolve the video urls
@@ -303,19 +317,36 @@ Then simply type this
allowing to play the video while
downloading (some players may not be able
to play it)
--external-downloader COMMAND Use the specified external downloader.
Currently supports
aria2c,avconv,axel,curl,ffmpeg,httpie,wget
--external-downloader-args ARGS Give these arguments to the external
downloader
--external-downloader NAME Use the specified external downloader.
Currently supports aria2c, avconv, axel,
curl, ffmpeg, httpie, wget
--downloader-args NAME:ARGS Give these arguments to the external
downloader. Specify the downloader name and
the arguments separated by a colon ":". You
can use this option multiple times
(Alias: --external-downloader-args)
## Filesystem Options:
-a, --batch-file FILE File containing URLs to download ('-' for
stdin), one URL per line. Lines starting
with '#', ';' or ']' are considered as
comments and ignored
-o, --output TEMPLATE Output filename template, see "OUTPUT
-P, --paths TYPE:PATH The paths where the files should be
downloaded. Specify the type of file and
the path separated by a colon ":". All the
same types as --output are supported.
Additionally, you can also provide "home"
and "temp" paths. All intermediary files
are first downloaded to the temp path and
then the final files are moved over to the
home path after download is finished. This
option is ignored if --output is an
absolute path
-o, --output [TYPE:]TEMPLATE Output filename template, see "OUTPUT
TEMPLATE" for details
--output-na-placeholder TEXT Placeholder value for unavailable meta
fields in output filename template
(default: "NA")
--autonumber-start NUMBER Specify the start value for %(autonumber)s
(default is 1)
--restrict-filenames Restrict filenames to only ASCII
@@ -328,9 +359,11 @@ Then simply type this
This option includes --no-continue
--no-force-overwrites Do not overwrite the video, but overwrite
related files (default)
-c, --continue Resume partially downloaded files (default)
--no-continue Restart download of partially downloaded
files from beginning
-c, --continue Resume partially downloaded files/fragments
(default)
--no-continue Do not resume partially downloaded
fragments. If the file is unfragmented,
restart download of the entire file
--part Use .part files instead of writing directly
into output file (default)
--no-part Do not use .part files - write directly
@@ -343,10 +376,18 @@ Then simply type this
file
--no-write-description Do not write video description (default)
--write-info-json Write video metadata to a .info.json file
(this may contain personal information)
--no-write-info-json Do not write video metadata (default)
--write-annotations Write video annotations to a
.annotations.xml file
--no-write-annotations Do not write video annotations (default)
--write-playlist-metafiles Write playlist metadata in addition to the
video metadata when using --write-info-json,
--write-description etc. (default)
--no-write-playlist-metafiles Do not write playlist metadata when using
--write-info-json, --write-description etc.
--get-comments Retrieve video comments to be placed in the
.info.json file
--load-info-json FILE JSON file containing the video information
(created with the "--write-info-json"
option)
@@ -435,7 +476,7 @@ Then simply type this
--referer URL Specify a custom referer, use if the video
access is restricted to one domain
--add-header FIELD:VALUE Specify a custom HTTP header and its value,
separated by a colon ':'. You can use this
separated by a colon ":". You can use this
option multiple times
--bidi-workaround Work around terminals that lack
bidirectional text support. Requires bidiv
@@ -481,17 +522,17 @@ Then simply type this
--list-formats-old Present the output of -F in the old form
(Alias: --no-list-formats-as-table)
--youtube-include-dash-manifest Download the DASH manifests and related
data on YouTube videos (default) (Alias:
--no-youtube-skip-dash-manifest)
data on YouTube videos (default)
(Alias: --no-youtube-skip-dash-manifest)
--youtube-skip-dash-manifest Do not download the DASH manifests and
related data on YouTube videos (Alias:
--no-youtube-include-dash-manifest)
related data on YouTube videos
(Alias: --no-youtube-include-dash-manifest)
--youtube-include-hls-manifest Download the HLS manifests and related data
on YouTube videos (default) (Alias:
--no-youtube-skip-hls-manifest)
on YouTube videos (default)
(Alias: --no-youtube-skip-hls-manifest)
--youtube-skip-hls-manifest Do not download the HLS manifests and
related data on YouTube videos (Alias:
--no-youtube-include-hls-manifest)
related data on YouTube videos
(Alias: --no-youtube-include-hls-manifest)
--merge-output-format FORMAT If a merge is required (e.g.
bestvideo+bestaudio), output to given
container format. One of mkv, mp4, ogg,
@@ -535,27 +576,30 @@ Then simply type this
## Post-Processing Options:
-x, --extract-audio Convert video files to audio-only files
(requires ffmpeg or avconv and ffprobe or
avprobe)
(requires ffmpeg and ffprobe)
--audio-format FORMAT Specify audio format: "best", "aac",
"flac", "mp3", "m4a", "opus", "vorbis", or
"wav"; "best" by default; No effect without
-x
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
a value between 0 (better) and 9 (worse)
for VBR or a specific bitrate like 128K
--audio-quality QUALITY Specify ffmpeg audio quality, insert a
value between 0 (better) and 9 (worse) for
VBR or a specific bitrate like 128K
(default 5)
--remux-video FORMAT Remux the video into another container if
necessary (currently supported: mp4|mkv).
If target container does not support the
video/audio codec, remuxing will fail
necessary (currently supported: mp4|mkv|flv
|webm|mov|avi|mp3|mka|m4a|ogg|opus). If
target container does not support the
video/audio codec, remuxing will fail. You
can specify multiple rules; eg.
"aac>m4a/mov>mp4/mkv" will remux aac to
m4a, mov to mp4 and anything else to mkv.
--recode-video FORMAT Re-encode the video into another format if
re-encoding is necessary (currently
supported: mp4|flv|ogg|webm|mkv|avi)
re-encoding is necessary. The supported
formats are the same as --remux-video
--postprocessor-args NAME:ARGS Give these arguments to the postprocessors.
Specify the postprocessor/executable name
and the arguments separated by a colon ':'
to give the argument to only the specified
and the arguments separated by a colon ":"
to give the argument to the specified
postprocessor/executable. Supported
postprocessors are: SponSkrub,
ExtractAudio, VideoRemuxer, VideoConvertor,
@@ -563,14 +607,14 @@ Then simply type this
FixupStretched, FixupM4a, FixupM3u8,
SubtitlesConvertor and EmbedThumbnail. The
supported executables are: SponSkrub,
FFmpeg, FFprobe, avconf, avprobe and
AtomicParsley. You can use this option
multiple times to give different arguments
to different postprocessors. You can also
specify "PP+EXE:ARGS" to give the arguments
to the specified executable only when being
used by the specified postprocessor (Alias:
--ppa)
FFmpeg, FFprobe, and AtomicParsley. You can
use this option multiple times to give
different arguments to different
postprocessors. You can also specify
"PP+EXE:ARGS" to give the arguments to the
specified executable only when being used
by the specified postprocessor. You can use
this option multiple times (Alias: --ppa)
-k, --keep-video Keep the intermediate video file on disk
after post-processing
--no-keep-video Delete the intermediate video file after
@@ -584,16 +628,20 @@ Then simply type this
--no-embed-thumbnail Do not embed thumbnail (default)
--add-metadata Write metadata to the video file
--no-add-metadata Do not write metadata (default)
--metadata-from-title FORMAT Parse additional metadata like song title /
artist from the video title. The format
syntax is the same as --output. Regular
expression with named capture groups may
--parse-metadata FIELD:FORMAT Parse additional metadata like title/artist
from other fields. Give field name to
extract data from, and format of the field
separated by a ":". Either regular
expression with named capture groups or a
similar syntax to the output template can
also be used. The parsed parameters replace
existing values. Example: --metadata-from-
title "%(artist)s - %(title)s" matches a
any existing values and can be used in
output template. This option can be used
multiple times. Example: --parse-metadata
"title:%(artist)s - %(title)s" matches a
title like "Coldplay - Paradise". Example
(regex): --metadata-from-title
"(?P<artist>.+?) - (?P<title>.+)"
(regex): --parse-metadata
"description:Artist - (?P<artist>.+?)"
--xattrs Write metadata to the video file's xattrs
(using dublin core and xdg standards)
--fixup POLICY Automatically correct known faults of the
@@ -601,15 +649,9 @@ Then simply type this
emit a warning), detect_or_warn (the
default; fix file if we can, warn
otherwise)
--prefer-avconv Prefer avconv over ffmpeg for running the
postprocessors (Alias: --no-prefer-ffmpeg)
--prefer-ffmpeg Prefer ffmpeg over avconv for running the
postprocessors (default)
(Alias: --no-prefer-avconv)
--ffmpeg-location PATH Location of the ffmpeg/avconv binary;
either the path to the binary or its
containing directory
(Alias: --avconv-location)
--ffmpeg-location PATH Location of the ffmpeg binary; either the
path to the binary or its containing
directory
--exec CMD Execute a command on the file after
downloading and post-processing, similar to
find's -exec syntax. Example: --exec 'adb
@@ -648,8 +690,9 @@ Then simply type this
You can configure youtube-dlc by placing any supported command line option to a configuration file. The configuration is loaded from the following locations:
1. The file given by `--config-location`
1. **Main Configuration**: The file given by `--config-location`
1. **Portable Configuration**: `yt-dlp.conf` or `youtube-dlc.conf` in the same directory as the bundled binary. If you are running from source-code (`<root dir>/youtube_dlc/__main__.py`), the root directory is used instead.
1. **Home Configuration**: `yt-dlp.conf` or `youtube-dlc.conf` in the home path given by `-P "home:<path>"`, or in the current directory if no such path is given
1. **User Configuration**:
* `%XDG_CONFIG_HOME%/yt-dlp/config` (recommended on Linux/macOS)
* `%XDG_CONFIG_HOME%/yt-dlp.conf`
@@ -707,11 +750,15 @@ set HOME=%USERPROFILE%
# OUTPUT TEMPLATE
The `-o` option allows users to indicate a template for the output file names.
The `-o` option is used to indicate a template for the output file names while `-P` option is used to specify the path each type of file should be saved to.
**tl;dr:** [navigate me to examples](#output-template-examples).
The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dlc -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Allowed names along with sequence type are:
The basic usage of `-o` is not to set any template arguments when downloading a single file, like in `youtube-dlc -o funny_video.flv "https://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by formatting operations. Date/time fields can also be formatted according to [strftime formatting](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) by specifying it inside the parentheses, separated from the field name using a `>`. For example, `%(duration>%H-%M-%S)s`.
Additionally, you can set different output templates for the various metadata files separately from the general output template by specifying the type of file followed by the template separated by a colon ":". The different filetypes supported are `subtitle|thumbnail|description|annotation|infojson|pl_description|pl_infojson`. For example, `-o '%(title)s.%(ext)s' -o 'thumbnail:%(title)s\%(title)s.%(ext)s'` will put the thumbnails in a folder with the same name as the video.
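To make the `>` syntax concrete, here is a minimal Python sketch of how a `%(field>strftime_format)s` sequence could be expanded. It is an illustration only, not the program's actual implementation: the helper `expand_time_fields`, the regular expression, and the assumption that string values are `YYYYMMDD` dates while numeric values are a number of seconds are all hypothetical.

```python
import re
import time

# Hypothetical pattern for sequences such as %(duration>%H-%M-%S)s
FIELD_WITH_FORMAT = re.compile(r'%\((?P<field>\w+)>(?P<fmt>[^)]+)\)s')


def expand_time_fields(template, info, na_placeholder='NA'):
    """Expand only the %(field>format)s sequences; leave everything else as-is."""
    def replace(match):
        value = info.get(match.group('field'))
        if value is None:
            return na_placeholder
        if isinstance(value, str):
            # Assumption: string values are dates in YYYYMMDD form (like upload_date)
            parsed = time.strptime(value, '%Y%m%d')
        else:
            # Assumption: numeric values are a number of seconds (like duration)
            parsed = time.gmtime(value)
        return time.strftime(match.group('fmt'), parsed)

    return FIELD_WITH_FORMAT.sub(replace, template)


info = {'duration': 3725, 'upload_date': '20210204'}
print(expand_time_fields('%(duration>%H-%M-%S)s', info))
print(expand_time_fields('%(upload_date>%Y)s/%(title)s.%(ext)s', info))
```

In this sketch, plain sequences such as `%(title)s` are left untouched so that they can still be filled in by ordinary string formatting afterwards.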
The available fields are:
- `id` (string): Video identifier
- `title` (string): Video title
@@ -741,7 +788,7 @@ The basic usage is not to set any template arguments when downloading a single f
- `is_live` (boolean): Whether this video is a live stream or a fixed-length video
- `start_time` (numeric): Time in seconds where the reproduction should start, as specified in the URL
- `end_time` (numeric): Time in seconds where the reproduction should end, as specified in the URL
- `format` (string): A human-readable description of the format
- `format` (string): A human-readable description of the format
- `format_id` (string): Format code specified by `--format`
- `format_note` (string): Additional info about the format
- `width` (numeric): Width of the video
@@ -798,7 +845,7 @@ Available for the media that is a track or a part of a music album:
- `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
- `release_year` (numeric): Year (YYYY) when the album was released
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with `NA`.
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with the placeholder value provided with `--output-na-placeholder` (`NA` by default).
For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dlc test video` and id `BaW_jenozKcj`, this will result in a `youtube-dlc test video-BaW_jenozKcj.mp4` file created in the current directory.
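The substitution described above can be sketched with Python's own `%` string formatting. This is a simplified illustration, not the program's actual code: the function `render_filename` and its `na_placeholder` parameter are hypothetical names, and the real program also sanitizes file names, which this sketch omits.

```python
from collections import defaultdict


def render_filename(template, info, na_placeholder='NA'):
    """Fill an output template; fields missing from info become the placeholder."""
    # defaultdict supplies the placeholder for any name the template asks for
    fields = defaultdict(lambda: na_placeholder, info)
    return template % fields


info = {'title': 'youtube-dlc test video', 'id': 'BaW_jenozKcj', 'ext': 'mp4'}
print(render_filename('%(title)s-%(id)s.%(ext)s', info))
# A field that the extractor did not provide falls back to the placeholder
print(render_filename('%(uploader)s - %(title)s.%(ext)s', info))
```

Note that in this sketch a missing field combined with a numeric conversion such as `%(NAME)05d` would raise a `TypeError`, because the placeholder is a string.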
@@ -818,7 +865,7 @@ If you are using an output template inside a Windows batch file then you must es
#### Output template examples
Note that on Windows you may need to use double quotes instead of single.
Note that on Windows you need to use double quotes instead of single.
```bash
$ youtube-dlc --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc
@@ -830,14 +877,17 @@ youtube-dlc_test_video_.mp4 # A simple file name
# Download YouTube playlist videos in separate directory indexed by video order in a playlist
$ youtube-dlc -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
# Download YouTube playlist videos in separate directories according to their uploaded year
$ youtube-dlc -o '%(upload_date>%Y)s/%(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
# Download all playlists of YouTube channel/user keeping each playlist in separate directory:
$ youtube-dlc -o '%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/TheLinuxFoundation/playlists
# Download Udemy course keeping each chapter in separate directory under MyVideos directory in your home
$ youtube-dlc -u user -p password -o '~/MyVideos/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
$ youtube-dlc -u user -p password -P '~/MyVideos' -o '%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
# Download entire series season keeping each series and each season in separate directory under C:/MyVideos
$ youtube-dlc -o "C:/MyVideos/%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" https://videomore.ru/kino_v_detalayah/5_sezon/367617
$ youtube-dlc -P "C:/MyVideos" -o "%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" https://videomore.ru/kino_v_detalayah/5_sezon/367617
# Stream the video being downloaded to stdout
$ youtube-dlc -o - BaW_jenozKc
@@ -846,7 +896,7 @@ $ youtube-dlc -o - BaW_jenozKc
# FORMAT SELECTION
By default, youtube-dlc tries to download the best available quality if you **don't** pass any options.
This is generally equivalent to using `-f bestvideo*+bestaudio/best`. However, if multiple audio streams are enabled (`--audio-multistreams`), the default format changes to `-f bestvideo+bestaudio/best`. Similarly, if ffmpeg and avconv are unavailable, or if you use youtube-dlc to stream to `stdout` (`-o -`), the default becomes `-f best/bestvideo+bestaudio`.
This is generally equivalent to using `-f bestvideo*+bestaudio/best`. However, if multiple audio streams are enabled (`--audio-multistreams`), the default format changes to `-f bestvideo+bestaudio/best`. Similarly, if ffmpeg is unavailable, or if you use youtube-dlc to stream to `stdout` (`-o -`), the default becomes `-f best/bestvideo+bestaudio`.
The general syntax for format selection is `-f FORMAT` (or `--format FORMAT`) where `FORMAT` is a *selector expression*, i.e. an expression that describes format or formats you would like to download.
@@ -877,7 +927,7 @@ If you want to download multiple videos and they don't have the same formats ava
If you want to download several formats of the same video use a comma as a separator, e.g. `-f 22,17,18` will download all these three formats, of course if they are available. Or a more sophisticated example combined with the precedence feature: `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`.
You can merge the video and audio of multiple formats into a single file using `-f <format1>+<format2>+...` (requires ffmpeg or avconv installed), for example `-f bestvideo+bestaudio` will download the best video-only format, the best audio-only format and mux them together with ffmpeg/avconv. If `--no-video-multistreams` is used, all formats with a video stream except the first one are ignored. Similarly, if `--no-audio-multistreams` is used, all formats with an audio stream except the first one are ignored. For example, `-f bestvideo+best+bestaudio` will download and merge all 3 given formats. The resulting file will have 2 video streams and 2 audio streams. But `-f bestvideo+best+bestaudio --no-video-multistreams` will download and merge only `bestvideo` and `bestaudio`. `best` is ignored since another format containing a video stream (`bestvideo`) has already been selected. The order of the formats is therefore important. `-f best+bestaudio --no-audio-multistreams` will download only `best` while `-f bestaudio+best --no-audio-multistreams` will ignore `best` and download only `bestaudio`.
You can merge the video and audio of multiple formats into a single file using `-f <format1>+<format2>+...` (requires ffmpeg installed), for example `-f bestvideo+bestaudio` will download the best video-only format, the best audio-only format and mux them together with ffmpeg. If `--no-video-multistreams` is used, all formats with a video stream except the first one are ignored. Similarly, if `--no-audio-multistreams` is used, all formats with an audio stream except the first one are ignored. For example, `-f bestvideo+best+bestaudio` will download and merge all 3 given formats. The resulting file will have 2 video streams and 2 audio streams. But `-f bestvideo+best+bestaudio --no-video-multistreams` will download and merge only `bestvideo` and `bestaudio`. `best` is ignored since another format containing a video stream (`bestvideo`) has already been selected. The order of the formats is therefore important. `-f best+bestaudio --no-audio-multistreams` will download only `best` while `-f bestaudio+best --no-audio-multistreams` will ignore `best` and download only `bestaudio`.
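The "all except the first one are ignored" rule can be sketched in Python as follows. This is a simplified model under stated assumptions, not the program's actual merge code: the function `select_for_merge` and its parameters are hypothetical, and each format is assumed to be a dict whose `vcodec`/`acodec` is `'none'` when that stream is absent.

```python
def select_for_merge(formats, allow_multiple_video=True, allow_multiple_audio=True):
    """Return the formats that would be merged, in the order they were requested."""
    selected = []
    have_video = have_audio = False
    for fmt in formats:
        has_video = fmt.get('vcodec', 'none') != 'none'
        has_audio = fmt.get('acodec', 'none') != 'none'
        # Drop a format whose video stream would be a second video stream
        if has_video and have_video and not allow_multiple_video:
            continue
        # Drop a format whose audio stream would be a second audio stream
        if has_audio and have_audio and not allow_multiple_audio:
            continue
        selected.append(fmt)
        have_video = have_video or has_video
        have_audio = have_audio or has_audio
    return selected


# Illustrative stand-ins for the formats picked by each selector
bestvideo = {'format_id': 'bestvideo', 'vcodec': 'vp9', 'acodec': 'none'}
bestaudio = {'format_id': 'bestaudio', 'vcodec': 'none', 'acodec': 'opus'}
best = {'format_id': 'best', 'vcodec': 'h264', 'acodec': 'aac'}

# Models: -f bestvideo+best+bestaudio --no-video-multistreams
chosen = select_for_merge([bestvideo, best, bestaudio], allow_multiple_video=False)
print([f['format_id'] for f in chosen])

# Models: -f bestaudio+best --no-audio-multistreams
chosen = select_for_merge([bestaudio, best], allow_multiple_audio=False)
print([f['format_id'] for f in chosen])
```

Because the formats are examined in the order given, swapping the operands of `+` can change which formats survive.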
## Filtering Formats
@@ -916,35 +966,35 @@ Format selectors can also be grouped using parentheses, for example if you want
You can change the criteria for being considered the `best` by using `-S` (`--format-sort`). The general format for this is `--format-sort field1,field2...`. The available fields are:
- `video`, `has_video`: Gives priority to formats that have a video stream
- `audio`, `has_audio`: Gives priority to formats that have an audio stream
- `extractor`, `preference`, `extractor_preference`: The format preference as given by the extractor
- `lang`, `language_preference`: Language preference as given by the extractor
- `hasvid`: Gives priority to formats that have a video stream
- `hasaud`: Gives priority to formats that have an audio stream
- `ie_pref`: The format preference as given by the extractor
- `lang`: Language preference as given by the extractor
- `quality`: The quality of the format. This is a metadata field available in some websites
- `source`, `source_preference`: Preference of the source as given by the extractor
- `proto`, `protocol`: Protocol used for download (`https`/`ftps` > `http`/`ftp` > `m3u8-native` > `m3u8` > `http-dash-segments` > other > `mms`/`rtsp` > unknown > `f4f`/`f4m`)
- `vcodec`, `video_codec`: Video Codec (`vp9` > `h265` > `h264` > `vp8` > `h263` > `theora` > other > unknown)
- `acodec`, `audio_codec`: Audio Codec (`opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `ac3` > `dts` > other > unknown)
- `source`: Preference of the source as given by the extractor
- `proto`: Protocol used for download (`https`/`ftps` > `http`/`ftp` > `m3u8-native` > `m3u8` > `http-dash-segments` > other > `mms`/`rtsp` > unknown > `f4f`/`f4m`)
- `vcodec`: Video Codec (`av01` > `vp9` > `h265` > `h264` > `vp8` > `h263` > `theora` > other > unknown)
- `acodec`: Audio Codec (`opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `ac3` > `dts` > other > unknown)
- `codec`: Equivalent to `vcodec,acodec`
- `vext`, `video_ext`: Video Extension (`mp4` > `webm` > `flv` > other > unknown). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`, `audio_ext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other > unknown). If `--prefer-free-formats` is used, the order changes to `opus` > `ogg` > `webm` > `m4a` > `mp3` > `aac`.
- `ext`, `extension`: Equivalent to `vext,aext`
- `vext`: Video Extension (`mp4` > `webm` > `flv` > other > unknown). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other > unknown). If `--prefer-free-formats` is used, the order changes to `opus` > `ogg` > `webm` > `m4a` > `mp3` > `aac`.
- `ext`: Equivalent to `vext,aext`
- `filesize`: Exact filesize, if known in advance. This will be unavailable for m3u8 and DASH formats.
- `filesize_approx`: Approximate filesize calculated from the manifests
- `size`, `filesize_estimate`: Exact filesize if available, otherwise approximate filesize
- `fs_approx`: Approximate filesize calculated from the manifests
- `size`: Exact filesize if available, otherwise approximate filesize
- `height`: Height of video
- `width`: Width of video
- `res`, `dimension`: Video resolution, calculated as the smallest dimension.
- `fps`, `framerate`: Framerate of video
- `tbr`, `total_bitrate`: Total average bitrate in KBit/s
- `vbr`, `video_bitrate`: Average video bitrate in KBit/s
- `abr`, `audio_bitrate`: Average audio bitrate in KBit/s
- `br`, `bitrate`: Equivalent to using `tbr,vbr,abr`
- `samplerate`, `asr`: Audio sample rate in Hz
- `res`: Video resolution, calculated as the smallest dimension.
- `fps`: Framerate of video
- `tbr`: Total average bitrate in KBit/s
- `vbr`: Average video bitrate in KBit/s
- `abr`: Average audio bitrate in KBit/s
- `br`: Equivalent to using `tbr,vbr,abr`
- `asr`: Audio sample rate in Hz
Note that any other **numerical** field made available by the extractor can also be used. All fields, unless specified otherwise, are sorted in descending order. To reverse this, prefix the field with a `+`. E.g. `+res` prefers the format with the smallest resolution. Additionally, you can suffix a preferred value for the fields, separated by a `:`. E.g. `res:720` prefers larger videos, but no larger than 720p, and the smallest video if there are no videos less than 720p. For `codec` and `ext`, you can provide two preferred values, the first for video and the second for audio. E.g. `+codec:avc:m4a` (equivalent to `+vcodec:avc,+acodec:m4a`) sets the video codec preference to `h264` > `h265` > `vp9` > `vp8` > `h263` > `theora` and audio codec preference to `mp4a` > `aac` > `vorbis` > `opus` > `mp3` > `ac3` > `dts`. You can also make the sorting prefer the values nearest to the one provided by using `~` as the delimiter. E.g. `filesize~1G` prefers the format with filesize closest to 1 GiB.
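The `:` and `~` suffixes described above can be illustrated with a small sketch. This is not the project's sorting code; `limit_key` and `closest_key` are hypothetical helpers that only mirror the behaviour stated in the paragraph, assuming plain integer field values.

```python
def limit_key(value, limit):
    """Rank `value` for a sort field written as `field:limit` (hypothetical helper)."""
    if value <= limit:
        return (1, value)   # within the limit: larger is better
    return (0, -value)      # above the limit: smaller is better


def closest_key(value, target):
    """Rank `value` for a sort field written as `field~target` (hypothetical helper)."""
    return -abs(value - target)


heights = [480, 720, 1080]
# Behaves like `res:720`: the largest height that does not exceed 720
best_limited = max(heights, key=lambda h: limit_key(h, 720))
# Behaves like `height~1000`: the height nearest to 1000
best_closest = max(heights, key=lambda h: closest_key(h, 1000))
```

When every candidate exceeds the limit, `limit_key` falls back to the smallest one, which matches the "smallest video if there are no videos less than 720p" rule.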
The fields `has_video`, `extractor`, `lang`, `quality` are always given highest priority in sorting, irrespective of the user-defined order. This behaviour can be changed by using `--force-format-sort`. Apart from these, the default order used is: `res,fps,codec,size,br,asr,proto,ext,has_audio,source,format_id`. Note that the extractors may override this default order, but they cannot override the user-provided order.
The fields `hasvid`, `ie_pref`, `lang`, `quality` are always given highest priority in sorting, irrespective of the user-defined order. This behaviour can be changed by using `--force-format-sort`. Apart from these, the default order used is: `res,fps,codec:vp9,size,br,asr,proto,ext,hasaud,source,id`. Note that the extractors may override this default order, but they cannot override the user-provided order.
If your format selector is `worst`, the last item is selected after sorting. This means it will select the format that is worst in all respects. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps`.
@@ -983,7 +1033,7 @@ $ youtube-dlc -f 'wv*+wa/w'
$ youtube-dlc -S '+res'
# Download the smallest video available
$ youtube-dlc -S '+size,+bitrate'
$ youtube-dlc -S '+size,+br'
@@ -1031,7 +1081,7 @@ $ youtube-dlc -f '(bv*+ba/b)[protocol^=http][protocol!*=dash] / (bv*+ba/b)'
# Download best video available via the best protocol
# (https/ftps > http/ftp > m3u8_native > m3u8 > http_dash_segments ...)
$ youtube-dlc -S 'protocol'
$ youtube-dlc -S 'proto'
@@ -1067,9 +1117,11 @@ $ youtube-dlc -S 'res:720,fps'
$ youtube-dlc -S '+res:480,codec,br'
```
# PLUGINS
Plugins are loaded from `<root-dir>/ytdlp_plugins/<type>/__init__.py`. Currently only `extractor` plugins are supported. Support for `downloader` and `postprocessor` plugins may be added in the future. See [ytdlp_plugins](ytdlp_plugins) for example.
**Note**: `<root-dir>` is the directory of the binary (`<root-dir>/youtube-dlc`), or the root directory of the module if you are running directly from source-code (`<root dir>/youtube_dlc/__main__.py`)
# MORE
For FAQ, Developer Instructions etc., see the [original README](https://github.com/ytdl-org/youtube-dl)
For FAQ, Developer Instructions etc., see the [original README](https://github.com/ytdl-org/youtube-dl#faq)

View File


78
devscripts/pyinst.py Normal file
View File

@@ -0,0 +1,78 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
import sys
import os
import platform
from PyInstaller.utils.win32.versioninfo import (
    VarStruct, VarFileInfo, StringStruct, StringTable,
    StringFileInfo, FixedFileInfo, VSVersionInfo, SetVersion,
)
import PyInstaller.__main__
arch = sys.argv[1] if len(sys.argv) > 1 else platform.architecture()[0][:2]
assert arch in ('32', '64')
print('Building %sbit version' % arch)
_x86 = '_x86' if arch == '32' else ''
FILE_DESCRIPTION = 'Media Downloader%s' % (' (32 Bit)' if _x86 else '')
SHORT_URLS = {'32': 'git.io/JUGsM', '64': 'git.io/JLh7K'}
root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
print('Changing working directory to %s' % root_dir)
os.chdir(root_dir)
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
VERSION = locals()['__version__']
VERSION_LIST = VERSION.replace('-', '.').split('.')
VERSION_LIST = list(map(int, VERSION_LIST)) + [0] * (4 - len(VERSION_LIST))
print('Version: %s%s' % (VERSION, _x86))
print('Remember to update the version using devscripts\\update-version.py')
VERSION_FILE = VSVersionInfo(
    ffi=FixedFileInfo(
        filevers=VERSION_LIST,
        prodvers=VERSION_LIST,
        mask=0x3F,
        flags=0x0,
        OS=0x4,
        fileType=0x1,
        subtype=0x0,
        date=(0, 0),
    ),
    kids=[
        StringFileInfo([
            StringTable(
                '040904B0', [
                    StringStruct('Comments', 'Youtube-dlc%s Command Line Interface.' % _x86),
                    StringStruct('CompanyName', 'pukkandan@gmail.com'),
                    StringStruct('FileDescription', FILE_DESCRIPTION),
                    StringStruct('FileVersion', VERSION),
                    StringStruct('InternalName', 'youtube-dlc%s' % _x86),
                    StringStruct(
                        'LegalCopyright',
                        'pukkandan@gmail.com | UNLICENSE',
                    ),
                    StringStruct('OriginalFilename', 'youtube-dlc%s.exe' % _x86),
                    StringStruct('ProductName', 'Youtube-dlc%s' % _x86),
                    StringStruct('ProductVersion', '%s%s | %s' % (VERSION, _x86, SHORT_URLS[arch])),
                ])]),
        VarFileInfo([VarStruct('Translation', [0, 1200])])
    ]
)
PyInstaller.__main__.run([
    '--name=youtube-dlc%s' % _x86,
    '--onefile',
    '--icon=devscripts/cloud.ico',
    '--exclude-module=youtube_dl',
    '--exclude-module=test',
    '--exclude-module=ytdlp_plugins',
    '--hidden-import=mutagen',
    'youtube_dlc/__main__.py',
])
SetVersion('dist/youtube-dlc%s.exe' % _x86, VERSION_FILE)
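The two small computations at the top of `devscripts/pyinst.py` (architecture detection and the four-part version list passed to `FixedFileInfo`) can be tried in isolation. The sketch below wraps them in functions named `detect_arch` and `version_list`; both names are illustrative and do not appear in the script.

```python
import platform


def detect_arch(argv):
    """Return '32' or '64' from argv[1], else from the running interpreter."""
    arch = argv[1] if len(argv) > 1 else platform.architecture()[0][:2]
    if arch not in ('32', '64'):
        raise ValueError('unsupported architecture: %r' % arch)
    return arch


def version_list(version):
    """Split a version string into ints, padded with zeros to four parts."""
    parts = version.replace('-', '.').split('.')
    return list(map(int, parts)) + [0] * (4 - len(parts))


print(detect_arch(['pyinst.py', '32']))
print(version_list('2021.02.04'))
```

The sketch raises `ValueError` where the script uses `assert`; that substitution is only so the check survives `python -O`.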

View File

@@ -0,0 +1,31 @@
from __future__ import unicode_literals
from datetime import datetime
# import urllib.request
# response = urllib.request.urlopen('https://blackjack4494.github.io/youtube-dlc/update/LATEST_VERSION')
# old_version = response.read().decode('utf-8')
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
old_version = locals()['__version__']
old_version_list = old_version.replace('-', '.').split(".", 4)
old_ver = '.'.join(old_version_list[:3])
old_rev = old_version_list[3] if len(old_version_list) > 3 else ''
ver = datetime.now().strftime("%Y.%m.%d")
rev = str(int(old_rev or 0) + 1) if old_ver == ver else ''
VERSION = '.'.join((ver, rev)) if rev else ver
# VERSION_LIST = [(int(v) for v in ver.split(".") + [rev or 0])]
print('::set-output name=ytdlc_version::' + VERSION)
file_version_py = open('youtube_dlc/version.py', 'rt')
data = file_version_py.read()
data = data.replace(old_version, VERSION)
file_version_py.close()
file_version_py = open('youtube_dlc/version.py', 'wt')
file_version_py.write(data)
file_version_py.close()
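The revision-bumping rule in the new script can be checked without touching `youtube_dlc/version.py`. The sketch below lifts the same expressions into a function; the name `next_version` and the explicit `today` argument are assumptions made for testability, not part of the script.

```python
def next_version(old_version, today):
    """Return the next version string for a release made on `today` (YYYY.MM.DD)."""
    parts = old_version.replace('-', '.').split('.', 4)
    old_ver = '.'.join(parts[:3])
    old_rev = parts[3] if len(parts) > 3 else ''
    # Same date as the previous release: bump the revision; otherwise drop it
    rev = str(int(old_rev or 0) + 1) if old_ver == today else ''
    return '.'.join((today, rev)) if rev else today


print(next_version('2021.02.04', '2021.02.04'))
print(next_version('2021.02.04.1', '2021.02.05'))
```

Because `-` is normalised to `.` before splitting, versions written in the older `YYYY.MM.DD-N` style are read the same way as the new dotted form.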

View File

@@ -47,12 +47,13 @@
- **Amara**
- **AMCNetworks**
- **AmericasTestKitchen**
- **AmericasTestKitchenSeason**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimeLab**
- **AnimeLabShows**
- **AnimeOnDemand**
- **Anvato**
- **aol.com**
- **aol.com**: Yahoo screen and movies
- **APA**
- **Aparat**
- **AppleConnect**
@@ -79,6 +80,9 @@
- **AudioBoom**
- **audiomack**
- **audiomack:album**
- **Audius**: Audius.co
- **audius:playlist**: Audius.co playlists
- **audius:track**: Audius track ID or API link. Prepend with "audius:"
- **AWAAN**
- **awaan:live**
- **awaan:season**
@@ -111,7 +115,9 @@
- **BiliBili**
- **BilibiliAudio**
- **BilibiliAudioAlbum**
- **BilibiliChannel**
- **BiliBiliPlayer**
- **BiliBiliSearch**: Bilibili video search, "bilisearch" keyword
- **BioBioChileTV**
- **Biography**
- **BIQLE**
@@ -197,8 +203,6 @@
- **CNNArticle**
- **CNNBlogs**
- **ComedyCentral**
- **ComedyCentralFullEpisodes**
- **ComedyCentralShortname**
- **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **CONtv**
@@ -520,6 +524,12 @@
- **Mgoon**
- **MGTV**: 芒果TV
- **MiaoPai**
- **mildom**: Record ongoing live by specific user in Mildom
- **mildom:user:vod**: Download all VODs from specific user in Mildom
- **mildom:vod**: Download a VOD in Mildom
- **minds**
- **minds:channel**
- **minds:group**
- **MinistryGrid**
- **Minoto**
- **miomio.tv**
@@ -549,6 +559,7 @@
- **mtv:video**
- **mtvjapan**
- **mtvservices:embedded**
- **MTVUutisetArticle**
- **MuenchenTV**: münchen.tv
- **mva**: Microsoft Virtual Academy videos
- **mva:course**: Microsoft Virtual Academy courses
@@ -880,6 +891,8 @@
- **Sport5**
- **SportBox**
- **SportDeutschland**
- **spotify**
- **spotify:show**
- **Spreaker**
- **SpreakerPage**
- **SpreakerShow**
@@ -962,13 +975,13 @@
- **TNAFlixNetworkEmbed**
- **toggle**
- **ToonGoggles**
- **Tosh**: Tosh.0
- **tou.tv**
- **Toypics**: Toypics video
- **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken)
- **Trilulilu**
- **TrovoLive**
- **Trovo**
- **TrovoVod**
- **TruNews**
- **TruTV**
- **Tube8**
@@ -1078,7 +1091,6 @@
- **vidme**
- **vidme:user**
- **vidme:user:likes**
- **Vidzi**
- **vier**: vier.be and vijf.be
- **vier:videos**
- **viewlift**
@@ -1123,6 +1135,7 @@
- **vrv**
- **vrv:series**
- **VShare**
- **VTM**
- **VTXTV**
- **vube**: Vube.com
- **VuClip**

View File

@@ -1 +0,0 @@
py -m PyInstaller youtube_dlc\__main__.py --onefile --name youtube-dlc --version-file win\ver.txt --icon win\icon\cloud.ico --upx-exclude=vcruntime140.dll

View File

@@ -1,92 +0,0 @@
from __future__ import unicode_literals
from PyInstaller.utils.win32.versioninfo import (
VarStruct, VarFileInfo, StringStruct, StringTable,
StringFileInfo, FixedFileInfo, VSVersionInfo, SetVersion,
)
import PyInstaller.__main__
from datetime import datetime
FILE_DESCRIPTION = 'Media Downloader'
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
_LATEST_VERSION = locals()['__version__']
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
if len(_OLD_VERSION) > 0:
    old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
    old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
    if old_rev:
        rev = int(old_rev) + 1
    else:
        rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))
print(version)
version_list = ver.split(".")
_year, _month, _day = [int(value) for value in version_list]
_rev = 0
if rev:
    _rev = rev
_ver_tuple = _year, _month, _day, _rev
version_file = VSVersionInfo(
ffi=FixedFileInfo(
filevers=_ver_tuple,
prodvers=_ver_tuple,
mask=0x3F,
flags=0x0,
OS=0x4,
fileType=0x1,
subtype=0x0,
date=(0, 0),
),
kids=[
StringFileInfo(
[
StringTable(
"040904B0",
[
StringStruct("Comments", "Youtube-dlc Command Line Interface."),
StringStruct("CompanyName", "theidel@uni-bremen.de"),
StringStruct("FileDescription", FILE_DESCRIPTION),
StringStruct("FileVersion", version),
StringStruct("InternalName", "youtube-dlc"),
StringStruct(
"LegalCopyright",
"theidel@uni-bremen.de | UNLICENSE",
),
StringStruct("OriginalFilename", "youtube-dlc.exe"),
StringStruct("ProductName", "Youtube-dlc"),
StringStruct("ProductVersion", version + " | git.io/JLh7K"),
],
)
]
),
VarFileInfo([VarStruct("Translation", [0, 1200])])
]
)
PyInstaller.__main__.run([
'--name=youtube-dlc',
'--onefile',
'--icon=win/icon/cloud.ico',
'youtube_dlc/__main__.py',
])
SetVersion('dist/youtube-dlc.exe', version_file)

View File

@@ -1,92 +0,0 @@
from __future__ import unicode_literals
from PyInstaller.utils.win32.versioninfo import (
VarStruct, VarFileInfo, StringStruct, StringTable,
StringFileInfo, FixedFileInfo, VSVersionInfo, SetVersion,
)
import PyInstaller.__main__
from datetime import datetime
FILE_DESCRIPTION = 'Media Downloader 32 Bit Version'
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
_LATEST_VERSION = locals()['__version__']
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
if len(_OLD_VERSION) > 0:
    old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
    old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
    if old_rev:
        rev = int(old_rev) + 1
    else:
        rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))
print(version)
version_list = ver.split(".")
_year, _month, _day = [int(value) for value in version_list]
_rev = 0
if rev:
    _rev = rev
_ver_tuple = _year, _month, _day, _rev
version_file = VSVersionInfo(
ffi=FixedFileInfo(
filevers=_ver_tuple,
prodvers=_ver_tuple,
mask=0x3F,
flags=0x0,
OS=0x4,
fileType=0x1,
subtype=0x0,
date=(0, 0),
),
kids=[
StringFileInfo(
[
StringTable(
"040904B0",
[
StringStruct("Comments", "Youtube-dlc_x86 Command Line Interface."),
StringStruct("CompanyName", "theidel@uni-bremen.de"),
StringStruct("FileDescription", FILE_DESCRIPTION),
StringStruct("FileVersion", version),
StringStruct("InternalName", "youtube-dlc_x86"),
StringStruct(
"LegalCopyright",
"theidel@uni-bremen.de | UNLICENSE",
),
StringStruct("OriginalFilename", "youtube-dlc_x86.exe"),
StringStruct("ProductName", "Youtube-dlc_x86"),
StringStruct("ProductVersion", version + "_x86 | git.io/JUGsM"),
],
)
]
),
VarFileInfo([VarStruct("Translation", [0, 1200])])
]
)
PyInstaller.__main__.run([
'--name=youtube-dlc_x86',
'--onefile',
'--icon=win/icon/cloud.ico',
'youtube_dlc/__main__.py',
])
SetVersion('dist/youtube-dlc_x86.exe', version_file)

1
requirements.txt Normal file
View File

@@ -0,0 +1 @@
mutagen

View File

@@ -1,44 +0,0 @@
from __future__ import unicode_literals
from datetime import datetime
# import urllib.request
# response = urllib.request.urlopen('https://blackjack4494.github.io/youtube-dlc/update/LATEST_VERSION')
# _LATEST_VERSION = response.read().decode('utf-8')
exec(compile(open('youtube_dlc/version.py').read(), 'youtube_dlc/version.py', 'exec'))
_LATEST_VERSION = locals()['__version__']
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
if len(_OLD_VERSION) > 0:
    old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
    old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
    if old_rev:
        rev = int(old_rev) + 1
    else:
        rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))
print('::set-output name=ytdlc_version::' + version)
file_version_py = open('youtube_dlc/version.py', 'rt')
data = file_version_py.read()
data = data.replace(locals()['__version__'], version)
file_version_py.close()
file_version_py = open('youtube_dlc/version.py', 'wt')
file_version_py.write(data)
file_version_py.close()

View File

@@ -1,33 +0,0 @@
# Unused
from __future__ import unicode_literals
from datetime import datetime
import urllib.request
response = urllib.request.urlopen('https://blackjack4494.github.io/youtube-dlc/update/LATEST_VERSION')
_LATEST_VERSION = response.read().decode('utf-8')
_OLD_VERSION = _LATEST_VERSION.rsplit("-", 1)
if len(_OLD_VERSION) > 0:
    old_ver = _OLD_VERSION[0]
old_rev = ''
if len(_OLD_VERSION) > 1:
    old_rev = _OLD_VERSION[1]
now = datetime.now()
# ver = f'{datetime.today():%Y.%m.%d}'
ver = now.strftime("%Y.%m.%d")
rev = ''
if old_ver == ver:
    if old_rev:
        rev = int(old_rev) + 1
    else:
        rev = 1
_SEPARATOR = '-'
version = _SEPARATOR.join(filter(None, [ver, str(rev)]))

View File

@@ -2,5 +2,5 @@
universal = True
[flake8]
exclude = youtube_dlc/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv,devscripts/create-github-release.py,devscripts/release.sh,devscripts/show-downloads-statistics.py,scripts/update-version.py
exclude = youtube_dlc/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv,devscripts/create-github-release.py,devscripts/release.sh,devscripts/show-downloads-statistics.py
ignore = E402,E501,E731,E741,W503

View File

@@ -7,10 +7,12 @@ import warnings
import sys
from distutils.spawn import spawn
# Get the version from youtube_dlc/version.py without importing the package
exec(compile(open('youtube_dlc/version.py').read(),
'youtube_dlc/version.py', 'exec'))
DESCRIPTION = 'Command-line program to download videos from YouTube.com and many other video platforms.'
LONG_DESCRIPTION = '\n\n'.join((
@@ -18,6 +20,9 @@ LONG_DESCRIPTION = '\n\n'.join((
'**PS**: Many links in this document will not work since this is a copy of the README.md from Github',
open("README.md", "r", encoding="utf-8").read()))
REQUIREMENTS = ['mutagen']
if len(sys.argv) >= 2 and sys.argv[1] == 'py2exe':
    print("inv")
else:
@@ -41,10 +46,8 @@ else:
params = {
'data_files': data_files,
}
#if setuptools_available:
params['entry_points'] = {'console_scripts': ['youtube-dlc = youtube_dlc:main']}
#else:
# params['scripts'] = ['bin/youtube-dlc']
class build_lazy_extractors(Command):
description = 'Build the extractor lazy loading module'
@@ -62,6 +65,9 @@ class build_lazy_extractors(Command):
dry_run=self.dry_run,
)
packages = find_packages(exclude=("youtube_dl", "test", "ytdlp_plugins"))
setup(
name="yt-dlp",
version=__version__,
@@ -71,7 +77,8 @@ setup(
long_description=LONG_DESCRIPTION,
long_description_content_type="text/markdown",
url="https://github.com/pukkandan/yt-dlp",
packages=find_packages(exclude=("youtube_dl","test",)),
packages=packages,
install_requires=REQUIREMENTS,
project_urls={
'Documentation': 'https://github.com/pukkandan/yt-dlp#yt-dlp',
'Source': 'https://github.com/pukkandan/yt-dlp',

View File

@@ -637,13 +637,20 @@ class TestYoutubeDL(unittest.TestCase):
'title2': '%PATH%',
}
def fname(templ):
    ydl = YoutubeDL({'outtmpl': templ})
def fname(templ, na_placeholder='NA'):
    params = {'outtmpl': templ}
    if na_placeholder != 'NA':
        params['outtmpl_na_placeholder'] = na_placeholder
    ydl = YoutubeDL(params)
    return ydl.prepare_filename(info)
self.assertEqual(fname('%(id)s.%(ext)s'), '1234.mp4')
self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4')
# Replace missing fields with 'NA'
self.assertEqual(fname('%(uploader_date)s-%(id)s.%(ext)s'), 'NA-1234.mp4')
NA_TEST_OUTTMPL = '%(uploader_date)s-%(width)d-%(id)s.%(ext)s'
# Replace missing fields with 'NA' by default
self.assertEqual(fname(NA_TEST_OUTTMPL), 'NA-NA-1234.mp4')
# Or by provided placeholder
self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder='none'), 'none-none-1234.mp4')
self.assertEqual(fname(NA_TEST_OUTTMPL, na_placeholder=''), '--1234.mp4')
self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4')
self.assertEqual(fname('%(height)6d.%(ext)s'), ' 1080.mp4')
self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080 .mp4')
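The behaviour these assertions pin down (missing fields replaced by a configurable placeholder, even under a numeric conversion such as `%(width)d`) can be reproduced with a standalone sketch. This is not the project's `prepare_filename`; `fill_template` is a hypothetical, simplified stand-in that ignores escaped `%%` sequences and filename sanitisation.

```python
import collections
import re

# Matches printf-style numeric conversions such as %(width)d or %(height)-6d
NUMERIC_FIELD_RE = re.compile(r'%\((\w+)\)[-#0+ ]*\d*(?:\.\d+)?[diouxXeEfFgG]')


def fill_template(template, info, na_placeholder='NA'):
    """Expand `template` with `info`, substituting a placeholder for missing keys."""
    def to_string_conversion(match):
        name = match.group(1)
        # A missing numeric field must be rendered with %s, because the
        # placeholder is a string and %d would raise TypeError
        return match.group(0) if name in info else '%%(%s)s' % name

    template = NUMERIC_FIELD_RE.sub(to_string_conversion, template)
    values = collections.defaultdict(lambda: na_placeholder, info)
    return template % values


info = {'id': '1234', 'ext': 'mp4', 'height': 1080}
print(fill_template('%(uploader_date)s-%(width)d-%(id)s.%(ext)s', info))
```

Fields that are present keep their original conversion, so width and alignment flags such as `%(height)6d` still apply.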

View File

@@ -8,11 +8,11 @@ import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import get_params, try_rm
import youtube_dl.YoutubeDL
from youtube_dl.utils import DownloadError
import youtube_dlc.YoutubeDL
from youtube_dlc.utils import DownloadError
class YoutubeDL(youtube_dl.YoutubeDL):
class YoutubeDL(youtube_dlc.YoutubeDL):
    def __init__(self, *args, **kwargs):
        super(YoutubeDL, self).__init__(*args, **kwargs)
        self.to_stderr = self.to_screen

View File

@@ -8,10 +8,16 @@ import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dlc.postprocessor import MetadataFromTitlePP
from youtube_dlc.postprocessor import MetadataFromFieldPP, MetadataFromTitlePP
class TestMetadataFromField(unittest.TestCase):
    def test_format_to_regex(self):
        pp = MetadataFromFieldPP(None, ['title:%(title)s - %(artist)s'])
        self.assertEqual(pp._data[0]['regex'], r'(?P<title>[^\r\n]+)\ \-\ (?P<artist>[^\r\n]+)')
class TestMetadataFromTitle(unittest.TestCase):
    def test_format_to_regex(self):
        pp = MetadataFromTitlePP(None, '%(title)s - %(artist)s')
        self.assertEqual(pp._titleregex, r'(?P<title>.+)\ \-\ (?P<artist>.+)')
        self.assertEqual(pp._titleregex, r'(?P<title>[^\r\n]+)\ \-\ (?P<artist>[^\r\n]+)')
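The expected regex in these tests changes from `.+` to `[^\r\n]+`, which keeps a captured field from spanning a line break. A minimal sketch of such a template-to-regex conversion follows; `format_to_regex` here is a standalone function written for illustration and is assumed, not confirmed, to resemble the postprocessor's own method.

```python
import re


def format_to_regex(fmt):
    """Turn '%(name)s' placeholders into named groups that stop at line breaks."""
    regex = ''
    last_pos = 0
    for match in re.finditer(r'%\((\w+)\)s', fmt):
        # Literal text between placeholders is escaped verbatim
        regex += re.escape(fmt[last_pos:match.start()])
        regex += r'(?P<%s>[^\r\n]+)' % match.group(1)
        last_pos = match.end()
    regex += re.escape(fmt[last_pos:])
    return regex


pattern = format_to_regex('%(title)s - %(artist)s')
print(re.match(pattern, 'Hello - World').groupdict())
```

With `.+` and the `re.DOTALL` flag a title could swallow several lines of a description; the negated character class avoids that regardless of flags.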

View File

@@ -15,8 +15,6 @@ IGNORED_FILES = [
'setup.py', # http://bugs.python.org/issue13943
'conf.py',
'buildserver.py',
'pyinst.py',
'pyinst32.py',
]
IGNORED_DIRS = [

View File

@@ -1,275 +0,0 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import expect_value
from youtube_dlc.extractor import YoutubeIE
class TestYoutubeChapters(unittest.TestCase):
_TEST_CASES = [
(
# https://www.youtube.com/watch?v=A22oy8dFjqc
# pattern: 00:00 - <title>
'''This is the absolute ULTIMATE experience of Queen's set at LIVE AID, this is the best video mixed to the absolutely superior stereo radio broadcast. This vastly superior audio mix takes a huge dump on all of the official mixes. Best viewed in 1080p. ENJOY! ***MAKE SURE TO READ THE DESCRIPTION***<br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+36);return false;">00:36</a> - Bohemian Rhapsody<br /><a href="#" onclick="yt.www.watch.player.seekTo(02*60+42);return false;">02:42</a> - Radio Ga Ga<br /><a href="#" onclick="yt.www.watch.player.seekTo(06*60+53);return false;">06:53</a> - Ay Oh!<br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+34);return false;">07:34</a> - Hammer To Fall<br /><a href="#" onclick="yt.www.watch.player.seekTo(12*60+08);return false;">12:08</a> - Crazy Little Thing Called Love<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+03);return false;">16:03</a> - We Will Rock You<br /><a href="#" onclick="yt.www.watch.player.seekTo(17*60+18);return false;">17:18</a> - We Are The Champions<br /><a href="#" onclick="yt.www.watch.player.seekTo(21*60+12);return false;">21:12</a> - Is This The World We Created...?<br /><br />Short song analysis:<br /><br />- "Bohemian Rhapsody": Although it's a short medley version, it's one of the best performances of the ballad section, with Freddie nailing the Bb4s with the correct studio phrasing (for the first time ever!).<br /><br />- "Radio Ga Ga": Although it's missing one chorus, this is one of - if not the best - the best versions ever, Freddie nails all the Bb4s and sounds very clean! Spike Edney's Roland Jupiter 8 also really shines through on this mix, compared to the DVD releases!<br /><br />- "Audience Improv": A great improv, Freddie sounds strong and confident. You gotta love when he sustains that A4 for 4 seconds!<br /><br />- "Hammer To Fall": Despite missing a verse and a chorus, it's a strong version (possibly the best ever). 
Freddie sings the song amazingly, and even ad-libs a C#5 and a C5! Also notice how heavy Brian's guitar sounds compared to the thin DVD mixes - it roars!<br /><br />- "Crazy Little Thing Called Love": A great version, the crowd loves the song, the jam is great as well! Only downside to this is the slight feedback issues.<br /><br />- "We Will Rock You": Although cut down to the 1st verse and chorus, Freddie sounds strong. He nails the A4, and the solo from Dr. May is brilliant!<br /><br />- "We Are the Champions": Perhaps the high-light of the performance - Freddie is very daring on this version, he sustains the pre-chorus Bb4s, nails the 1st C5, belts great A4s, but most importantly: He nails the chorus Bb4s, in all 3 choruses! This is the only time he has ever done so! It has to be said though, the last one sounds a bit rough, but that's a side effect of belting high notes for the past 18 minutes, with nodules AND laryngitis!<br /><br />- "Is This The World We Created... ?": Freddie and Brian perform a beautiful version of this, and it is one of the best versions ever. It's both sad and hilarious that a couple of BBC engineers are talking over the song, one of them being completely oblivious of the fact that he is interrupting the performance, on live television... Which was being televised to almost 2 billion homes.<br /><br /><br />All rights go to their respective owners!<br />-----Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for fair use for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use''',
1477,
[{
'start_time': 36,
'end_time': 162,
'title': 'Bohemian Rhapsody',
}, {
'start_time': 162,
'end_time': 413,
'title': 'Radio Ga Ga',
}, {
'start_time': 413,
'end_time': 454,
'title': 'Ay Oh!',
}, {
'start_time': 454,
'end_time': 728,
'title': 'Hammer To Fall',
}, {
'start_time': 728,
'end_time': 963,
'title': 'Crazy Little Thing Called Love',
}, {
'start_time': 963,
'end_time': 1038,
'title': 'We Will Rock You',
}, {
'start_time': 1038,
'end_time': 1272,
'title': 'We Are The Champions',
}, {
'start_time': 1272,
'end_time': 1477,
'title': 'Is This The World We Created...?',
}]
),
(
# https://www.youtube.com/watch?v=ekYlRhALiRQ
# pattern: <num>. <title> 0:00
'1. Those Beaten Paths of Confusion <a href="#" onclick="yt.www.watch.player.seekTo(0*60+00);return false;">0:00</a><br />2. Beyond the Shadows of Emptiness & Nothingness <a href="#" onclick="yt.www.watch.player.seekTo(11*60+47);return false;">11:47</a><br />3. Poison Yourself...With Thought <a href="#" onclick="yt.www.watch.player.seekTo(26*60+30);return false;">26:30</a><br />4. The Agents of Transformation <a href="#" onclick="yt.www.watch.player.seekTo(35*60+57);return false;">35:57</a><br />5. Drowning in the Pain of Consciousness <a href="#" onclick="yt.www.watch.player.seekTo(44*60+32);return false;">44:32</a><br />6. Deny the Disease of Life <a href="#" onclick="yt.www.watch.player.seekTo(53*60+07);return false;">53:07</a><br /><br />More info/Buy: http://crepusculonegro.storenvy.com/products/257645-cn-03-arizmenda-within-the-vacuum-of-infinity<br /><br />No copyright is intended. The rights to this video are assumed by the owner and its affiliates.',
4009,
[{
'start_time': 0,
'end_time': 707,
'title': '1. Those Beaten Paths of Confusion',
}, {
'start_time': 707,
'end_time': 1590,
'title': '2. Beyond the Shadows of Emptiness & Nothingness',
}, {
'start_time': 1590,
'end_time': 2157,
'title': '3. Poison Yourself...With Thought',
}, {
'start_time': 2157,
'end_time': 2672,
'title': '4. The Agents of Transformation',
}, {
'start_time': 2672,
'end_time': 3187,
'title': '5. Drowning in the Pain of Consciousness',
}, {
'start_time': 3187,
'end_time': 4009,
'title': '6. Deny the Disease of Life',
}]
),
(
# https://www.youtube.com/watch?v=WjL4pSzog9w
# pattern: 00:00 <title>
'<a href="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" class="yt-uix-servicelink " data-target-new-window="True" data-servicelink="CDAQ6TgYACITCNf1raqT2dMCFdRjGAod_o0CBSj4HQ" data-url="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" rel="nofollow noopener" target="_blank">https://arizmenda.bandcamp.com/merch/...</a><br /><br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+00);return false;">00:00</a> Christening Unborn Deformities <br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+08);return false;">07:08</a> Taste of Purity<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+16);return false;">16:16</a> Sculpting Sins of a Universal Tongue<br /><a href="#" onclick="yt.www.watch.player.seekTo(24*60+45);return false;">24:45</a> Birth<br /><a href="#" onclick="yt.www.watch.player.seekTo(31*60+24);return false;">31:24</a> Neves<br /><a href="#" onclick="yt.www.watch.player.seekTo(37*60+55);return false;">37:55</a> Libations in Limbo',
2705,
[{
'start_time': 0,
'end_time': 428,
'title': 'Christening Unborn Deformities',
}, {
'start_time': 428,
'end_time': 976,
'title': 'Taste of Purity',
}, {
'start_time': 976,
'end_time': 1485,
'title': 'Sculpting Sins of a Universal Tongue',
}, {
'start_time': 1485,
'end_time': 1884,
'title': 'Birth',
}, {
'start_time': 1884,
'end_time': 2275,
'title': 'Neves',
}, {
'start_time': 2275,
'end_time': 2705,
'title': 'Libations in Limbo',
}]
),
(
# https://www.youtube.com/watch?v=o3r1sn-t3is
# pattern: <title> 00:00 <note>
'Download this show in MP3: <a href="http://sh.st/njZKK" class="yt-uix-servicelink " data-url="http://sh.st/njZKK" data-target-new-window="True" data-servicelink="CDAQ6TgYACITCK3j8_6o2dMCFVDCGAoduVAKKij4HQ" rel="nofollow noopener" target="_blank">http://sh.st/njZKK</a><br /><br />Setlist:<br />I-E-A-I-A-I-O <a href="#" onclick="yt.www.watch.player.seekTo(00*60+45);return false;">00:45</a><br />Suite-Pee <a href="#" onclick="yt.www.watch.player.seekTo(4*60+26);return false;">4:26</a> (Incomplete)<br />Attack <a href="#" onclick="yt.www.watch.player.seekTo(5*60+31);return false;">5:31</a> (First live performance since 2011)<br />Prison Song <a href="#" onclick="yt.www.watch.player.seekTo(8*60+42);return false;">8:42</a><br />Know <a href="#" onclick="yt.www.watch.player.seekTo(12*60+32);return false;">12:32</a> (First live performance since 2011)<br />Aerials <a href="#" onclick="yt.www.watch.player.seekTo(15*60+32);return false;">15:32</a><br />Soldier Side - Intro <a href="#" onclick="yt.www.watch.player.seekTo(19*60+13);return false;">19:13</a><br />B.Y.O.B. 
<a href="#" onclick="yt.www.watch.player.seekTo(20*60+09);return false;">20:09</a><br />Soil <a href="#" onclick="yt.www.watch.player.seekTo(24*60+32);return false;">24:32</a><br />Darts <a href="#" onclick="yt.www.watch.player.seekTo(27*60+48);return false;">27:48</a><br />Radio/Video <a href="#" onclick="yt.www.watch.player.seekTo(30*60+38);return false;">30:38</a><br />Hypnotize <a href="#" onclick="yt.www.watch.player.seekTo(35*60+05);return false;">35:05</a><br />Temper <a href="#" onclick="yt.www.watch.player.seekTo(38*60+08);return false;">38:08</a> (First live performance since 1999)<br />CUBErt <a href="#" onclick="yt.www.watch.player.seekTo(41*60+00);return false;">41:00</a><br />Needles <a href="#" onclick="yt.www.watch.player.seekTo(42*60+57);return false;">42:57</a><br />Deer Dance <a href="#" onclick="yt.www.watch.player.seekTo(46*60+27);return false;">46:27</a><br />Bounce <a href="#" onclick="yt.www.watch.player.seekTo(49*60+38);return false;">49:38</a><br />Suggestions <a href="#" onclick="yt.www.watch.player.seekTo(51*60+25);return false;">51:25</a><br />Psycho <a href="#" onclick="yt.www.watch.player.seekTo(53*60+52);return false;">53:52</a><br />Chop Suey! <a href="#" onclick="yt.www.watch.player.seekTo(58*60+13);return false;">58:13</a><br />Lonely Day <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+01*60+15);return false;">1:01:15</a><br />Question! 
<a href="#" onclick="yt.www.watch.player.seekTo(1*3600+04*60+14);return false;">1:04:14</a><br />Lost in Hollywood <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+08*60+10);return false;">1:08:10</a><br />Vicinity of Obscenity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+13*60+40);return false;">1:13:40</a>(First live performance since 2012)<br />Forest <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+16*60+17);return false;">1:16:17</a><br />Cigaro <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+20*60+02);return false;">1:20:02</a><br />Toxicity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+23*60+57);return false;">1:23:57</a>(with Chino Moreno)<br />Sugar <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+27*60+53);return false;">1:27:53</a>',
5640,
[{
'start_time': 45,
'end_time': 266,
'title': 'I-E-A-I-A-I-O',
}, {
'start_time': 266,
'end_time': 331,
'title': 'Suite-Pee (Incomplete)',
}, {
'start_time': 331,
'end_time': 522,
'title': 'Attack (First live performance since 2011)',
}, {
'start_time': 522,
'end_time': 752,
'title': 'Prison Song',
}, {
'start_time': 752,
'end_time': 932,
'title': 'Know (First live performance since 2011)',
}, {
'start_time': 932,
'end_time': 1153,
'title': 'Aerials',
}, {
'start_time': 1153,
'end_time': 1209,
'title': 'Soldier Side - Intro',
}, {
'start_time': 1209,
'end_time': 1472,
'title': 'B.Y.O.B.',
}, {
'start_time': 1472,
'end_time': 1668,
'title': 'Soil',
}, {
'start_time': 1668,
'end_time': 1838,
'title': 'Darts',
}, {
'start_time': 1838,
'end_time': 2105,
'title': 'Radio/Video',
}, {
'start_time': 2105,
'end_time': 2288,
'title': 'Hypnotize',
}, {
'start_time': 2288,
'end_time': 2460,
'title': 'Temper (First live performance since 1999)',
}, {
'start_time': 2460,
'end_time': 2577,
'title': 'CUBErt',
}, {
'start_time': 2577,
'end_time': 2787,
'title': 'Needles',
}, {
'start_time': 2787,
'end_time': 2978,
'title': 'Deer Dance',
}, {
'start_time': 2978,
'end_time': 3085,
'title': 'Bounce',
}, {
'start_time': 3085,
'end_time': 3232,
'title': 'Suggestions',
}, {
'start_time': 3232,
'end_time': 3493,
'title': 'Psycho',
}, {
'start_time': 3493,
'end_time': 3675,
'title': 'Chop Suey!',
}, {
'start_time': 3675,
'end_time': 3854,
'title': 'Lonely Day',
}, {
'start_time': 3854,
'end_time': 4090,
'title': 'Question!',
}, {
'start_time': 4090,
'end_time': 4420,
'title': 'Lost in Hollywood',
}, {
'start_time': 4420,
'end_time': 4577,
'title': 'Vicinity of Obscenity (First live performance since 2012)',
}, {
'start_time': 4577,
'end_time': 4802,
'title': 'Forest',
}, {
'start_time': 4802,
'end_time': 5037,
'title': 'Cigaro',
}, {
'start_time': 5037,
'end_time': 5273,
'title': 'Toxicity (with Chino Moreno)',
}, {
'start_time': 5273,
'end_time': 5640,
'title': 'Sugar',
}]
),
(
# https://www.youtube.com/watch?v=PkYLQbsqCE8
# pattern: <num> - <title> [<latinized title>] 0:00:00
'''Затемно (Zatemno) is an Obscure Black Metal Band from Russia.<br /><br />"Во прах (Vo prakh)'' Into The Ashes", Debut mini-album released may 6, 2016, by Death Knell Productions<br />Released on 6 panel digipak CD, limited to 100 copies only<br />And digital format on Bandcamp<br /><br />Tracklist<br /><br />1 - Во прах [Vo prakh] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+00*60+00);return false;">0:00:00</a><br />2 - Искупление [Iskupleniye] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+08*60+10);return false;">0:08:10</a><br />3 - Из серпов луны...[Iz serpov luny] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+14*60+30);return false;">0:14:30</a><br /><br />Links:<br /><a href="https://deathknellprod.bandcamp.com/album/--2" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://deathknellprod.bandcamp.com/album/--2" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://deathknellprod.bandcamp.com/a...</a><br /><a href="https://www.facebook.com/DeathKnellProd/" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://www.facebook.com/DeathKnellProd/" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://www.facebook.com/DeathKnellProd/</a><br /><br /><br />I don't have any right about this artifact, my only intention is to spread the music of the band, all rights are reserved to the Затемно (Zatemno) and his producers, Death Knell Productions.<br /><br />------------------------------------------------------------------<br /><br />Subscribe for more videos like this.<br />My link: <a href="https://web.facebook.com/AttackOfTheDragons" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://web.facebook.com/AttackOfTheDragons" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow 
noopener">https://web.facebook.com/AttackOfTheD...</a>''',
1138,
[{
'start_time': 0,
'end_time': 490,
'title': '1 - Во прах [Vo prakh]',
}, {
'start_time': 490,
'end_time': 870,
'title': '2 - Искупление [Iskupleniye]',
}, {
'start_time': 870,
'end_time': 1138,
'title': '3 - Из серпов луны...[Iz serpov luny]',
}]
),
(
# https://www.youtube.com/watch?v=xZW70zEasOk
# time point more than duration
'''● LCS Spring finals: Saturday and Sunday from <a href="#" onclick="yt.www.watch.player.seekTo(13*60+30);return false;">13:30</a> outside the venue! <br />● PAX East: Fri, Sat & Sun - more info in tomorrows video on the main channel!''',
283,
[]
),
]
def test_youtube_chapters(self):
for description, duration, expected_chapters in self._TEST_CASES:
ie = YoutubeIE()
expect_value(
self, ie._extract_chapters_from_description(description, duration),
expected_chapters, None)
if __name__ == '__main__':
unittest.main()
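
The expected values in the test cases above follow one rule: each chapter ends where the next one starts, the last chapter ends at the video duration, and a time point beyond the duration stops extraction. The sketch below illustrates that rule only. `build_chapters` and its `(start_seconds, title)` input are hypothetical names for this example, not the extractor's actual `_extract_chapters_from_description` implementation, which also parses the HTML description.

```python
def build_chapters(entries, duration):
    """Turn (start_seconds, title) pairs into chapter dicts.

    Hypothetical helper for illustration; the time points are assumed
    to be already parsed and listed in description order.
    """
    chapters = []
    for index, (start, title) in enumerate(entries):
        # A time point past the end of the video is not a chapter marker.
        if start > duration:
            break
        if index + 1 < len(entries):
            # A chapter ends where the next one begins, capped at the duration.
            end = min(entries[index + 1][0], duration)
        else:
            end = duration
        if start > end:
            break
        chapters.append({
            'start_time': start,
            'end_time': end,
            'title': title,
        })
    return chapters
```

With the three-track example above (starts 0, 490, 870 and duration 1138) this yields end times 490, 870 and 1138. A single time point of 13:30 in a 283-second video yields an empty list.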


@@ -86,13 +86,9 @@ class TestPlayerInfo(unittest.TestCase):
('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
('http://s.ytimg.com/yt/swfbin/watch_as3-vflrEm9Nq.swf', 'vflrEm9Nq'),
('https://s.ytimg.com/yts/swfbin/player-vflenCdZL/watch_as3.swf', 'vflenCdZL'),
)
for player_url, expected_player_id in PLAYER_URLS:
expected_player_type = player_url.split('.')[-1]
player_type, player_id = YoutubeIE._extract_player_info(player_url)
self.assertEqual(player_type, expected_player_type)
player_id = YoutubeIE._extract_player_info(player_url)
self.assertEqual(player_id, expected_player_id)


@@ -1,45 +0,0 @@
# UTF-8
#
# For more details about fixed file info 'ffi' see:
# http://msdn.microsoft.com/en-us/library/ms646997.aspx
VSVersionInfo(
ffi=FixedFileInfo(
# filevers and prodvers should be always a tuple with four items: (1, 2, 3, 4)
# Set not needed items to zero 0.
filevers=(16, 9, 2020, 0),
prodvers=(16, 9, 2020, 0),
# Contains a bitmask that specifies the valid bits of 'flags'
mask=0x3f,
# Contains a bitmask that specifies the Boolean attributes of the file.
flags=0x0,
# The operating system for which this file was designed.
# 0x4 - NT and there is no need to change it.
# OS=0x40004,
OS=0x4,
# The general type of file.
# 0x1 - the file is an application.
fileType=0x1,
# The function of the file.
# 0x0 - the function is not defined for this fileType
subtype=0x0,
# Creation date and time stamp.
date=(0, 0)
),
kids=[
StringFileInfo(
[
StringTable(
u'040904B0',
[StringStruct(u'Comments', u'Youtube-dlc Command Line Interface.'),
StringStruct(u'CompanyName', u'theidel@uni-bremen.de'),
StringStruct(u'FileDescription', u'Media Downloader'),
StringStruct(u'FileVersion', u'16.9.2020.0'),
StringStruct(u'InternalName', u'youtube-dlc'),
StringStruct(u'LegalCopyright', u'theidel@uni-bremen.de | UNLICENSE'),
StringStruct(u'OriginalFilename', u'youtube-dlc.exe'),
StringStruct(u'ProductName', u'Youtube-dlc'),
StringStruct(u'ProductVersion', u'16.9.2020.0 | git.io/JUGsM')])
]),
VarFileInfo([VarStruct(u'Translation', [0, 1200])])
]
)


@@ -49,6 +49,7 @@ from .utils import (
date_from_str,
DateRange,
DEFAULT_OUTTMPL,
OUTTMPL_TYPES,
determine_ext,
determine_protocol,
DOT_DESKTOP_LINK_TEMPLATE,
@@ -61,6 +62,7 @@ from .utils import (
ExistingVideoReached,
expand_path,
ExtractorError,
float_or_none,
format_bytes,
format_field,
formatSeconds,
@@ -69,6 +71,7 @@ from .utils import (
iri_to_uri,
ISO3166Utils,
locked_file,
make_dir,
make_HTTPS_handler,
MaxDownloadsReached,
orderedSet,
@@ -90,6 +93,7 @@ from .utils import (
sanitized_Request,
std_headers,
str_or_none,
strftime_or_none,
subtitles_filename,
to_high_limit_path,
UnavailableVideoError,
@@ -104,7 +108,7 @@ from .utils import (
process_communicate_or_kill,
)
from .cache import Cache
from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER
from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER, _PLUGIN_CLASSES
from .extractor.openload import PhantomJSwrapper
from .downloader import get_suitable_downloader
from .downloader.rtmp import rtmpdump_version
@@ -114,8 +118,9 @@ from .postprocessor import (
FFmpegFixupStretchedPP,
FFmpegMergerPP,
FFmpegPostProcessor,
FFmpegSubtitlesConvertorPP,
# FFmpegSubtitlesConvertorPP,
get_postprocessor,
MoveFilesAfterDownloadPP,
)
from .version import __version__
@@ -170,18 +175,26 @@ class YoutubeDL(object):
forcejson: Force printing info_dict as JSON.
dump_single_json: Force printing the info_dict of the whole playlist
(or video) as a single JSON line.
force_write_download_archive: Force writing download archive regardless of
'skip_download' or 'simulate'.
force_write_download_archive: Force writing download archive regardless
of 'skip_download' or 'simulate'.
simulate: Do not download the video files.
format: Video format code. see "FORMAT SELECTION" for more details.
format_sort: How to sort the video formats. see "Sorting Formats" for more details.
format_sort_force: Force the given format_sort. see "Sorting Formats" for more details.
allow_multiple_video_streams: Allow multiple video streams to be merged into a single file
allow_multiple_audio_streams: Allow multiple audio streams to be merged into a single file
outtmpl: Template for output names.
restrictfilenames: Do not allow "&" and spaces in file names.
trim_file_name: Limit length of filename (extension excluded).
ignoreerrors: Do not stop on download errors. (Default True when running youtube-dlc, but False when directly accessing YoutubeDL class)
format_sort: How to sort the video formats. see "Sorting Formats"
for more details.
format_sort_force: Force the given format_sort. see "Sorting Formats"
for more details.
allow_multiple_video_streams: Allow multiple video streams to be merged
into a single file
allow_multiple_audio_streams: Allow multiple audio streams to be merged
into a single file
outtmpl: Dictionary of templates for output names. Allowed keys
are 'default' and the keys of OUTTMPL_TYPES (in utils.py)
outtmpl_na_placeholder: Placeholder for unavailable meta fields.
restrictfilenames: Do not allow "&" and spaces in file names
trim_file_name: Limit length of filename (extension excluded)
ignoreerrors: Do not stop on download errors
(Default True when running youtube-dlc,
but False when directly accessing YoutubeDL class)
force_generic_extractor: Force downloader to use the generic extractor
overwrites: Overwrite all video and metadata files if True,
overwrite only non-video files if None
@@ -197,8 +210,12 @@ class YoutubeDL(object):
logtostderr: Log messages to stderr instead of stdout.
writedescription: Write the video description to a .description file
writeinfojson: Write the video metadata to a .info.json file
writecomments: Extract video comments. This will not be written to disk
unless writeinfojson is also given
writeannotations: Write the video annotations to a .annotations.xml file
writethumbnail: Write the thumbnail image to a file
allow_playlist_files: Whether to write playlists' description, infojson etc
also to disk when using the 'write*' options
write_all_thumbnails: Write all thumbnail formats to files
writelink: Write an internet shortcut file, depending on the
current platform (.url/.webloc/.desktop)
@@ -257,6 +274,8 @@ class YoutubeDL(object):
postprocessors: A list of dictionaries, each with an entry
* key: The name of the postprocessor. See
youtube_dlc/postprocessor/__init__.py for a list.
* _after_move: Optional. If True, run this post_processor
after 'MoveFilesAfterDownload'
as well as any further keyword arguments for the
postprocessor.
post_hooks: A list of functions that get called as the final step
@@ -287,6 +306,9 @@ class YoutubeDL(object):
Progress hooks are guaranteed to be called at least once
(with status "finished") if the download is successful.
merge_output_format: Extension to use when merging formats.
final_ext: Expected final extension; used to detect when the file was
already downloaded and converted. "merge_output_format" is
replaced by this extension when given
fixup: Automatically correct known faults of the file.
One of:
- "never": do nothing
@@ -340,7 +362,7 @@ class YoutubeDL(object):
The following options are used by the post processors:
prefer_ffmpeg: If False, use avconv instead of ffmpeg if both are available,
otherwise prefer ffmpeg.
otherwise prefer ffmpeg. (avconv support is deprecated)
ffmpeg_location: Location of the ffmpeg/avconv binary; either the path
to the binary or its containing directory.
postprocessor_args: A dictionary of postprocessor/executable keys (in lower case)
@@ -368,7 +390,8 @@ class YoutubeDL(object):
params = None
_ies = []
_pps = []
_pps = {'beforedl': [], 'aftermove': [], 'normal': []}
__prepare_filename_warned = False
_download_retcode = None
_num_downloads = None
_playlist_level = 0
@@ -381,7 +404,8 @@ class YoutubeDL(object):
params = {}
self._ies = []
self._ies_instances = {}
self._pps = []
self._pps = {'beforedl': [], 'aftermove': [], 'normal': []}
self.__prepare_filename_warned = False
self._post_hooks = []
self._progress_hooks = []
self._download_retcode = 0
@@ -427,6 +451,14 @@ class YoutubeDL(object):
if self.params.get('geo_verification_proxy') is None:
self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
if self.params.get('final_ext'):
if self.params.get('merge_output_format'):
self.report_warning('--merge-output-format will be ignored since --remux-video or --recode-video is given')
self.params['merge_output_format'] = self.params['final_ext']
if 'overwrites' in self.params and self.params['overwrites'] is None:
del self.params['overwrites']
check_deprecated('autonumber_size', '--autonumber-size', 'output template with %(autonumber)0Nd, where N is the number of digits')
check_deprecated('autonumber', '--auto-number', '-o "%(autonumber)s-%(title)s.%(ext)s"')
check_deprecated('usetitle', '--title', '-o "%(title)s-%(id)s.%(ext)s"')
@@ -468,10 +500,7 @@ class YoutubeDL(object):
'Set the LC_ALL environment variable to fix this.')
self.params['restrictfilenames'] = True
if isinstance(params.get('outtmpl'), bytes):
self.report_warning(
'Parameter outtmpl is bytes, but should be a unicode string. '
'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.')
self.outtmpl_dict = self.parse_outtmpl()
self._setup_opener()
@@ -483,8 +512,13 @@ class YoutubeDL(object):
pp_class = get_postprocessor(pp_def_raw['key'])
pp_def = dict(pp_def_raw)
del pp_def['key']
if 'when' in pp_def:
when = pp_def['when']
del pp_def['when']
else:
when = 'normal'
pp = pp_class(self, **compat_kwargs(pp_def))
self.add_post_processor(pp)
self.add_post_processor(pp, when=when)
for ph in self.params.get('post_hooks', []):
self.add_post_hook(ph)
@@ -536,9 +570,9 @@ class YoutubeDL(object):
for ie in gen_extractor_classes():
self.add_info_extractor(ie)
def add_post_processor(self, pp):
def add_post_processor(self, pp, when='normal'):
"""Add a PostProcessor object to the end of the chain."""
self._pps.append(pp)
self._pps[when].append(pp)
pp.set_downloader(self)
def add_post_hook(self, ph):
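
The two hunks above replace the flat postprocessor list with a dictionary keyed by stage, and read an optional `when` entry from each postprocessor definition. A minimal sketch of that bookkeeping follows. `PostProcessorRegistry` and the `factory` callable are hypothetical stand-ins for `YoutubeDL` and `get_postprocessor`; only the stage names and the `'normal'` default are taken from the diff.

```python
class PostProcessorRegistry:
    """Hypothetical stand-in for the postprocessor bookkeeping in YoutubeDL."""

    def __init__(self):
        # One ordered chain per stage, as in the new _pps attribute.
        self._pps = {'beforedl': [], 'aftermove': [], 'normal': []}

    def add_from_definition(self, pp_def_raw, factory):
        # Copy first so the caller's dictionary is left untouched.
        pp_def = dict(pp_def_raw)
        key = pp_def.pop('key')
        # 'when' selects the stage; definitions without it run as 'normal'.
        when = pp_def.pop('when', 'normal')
        # The remaining entries are keyword arguments for the postprocessor.
        pp = factory(key, **pp_def)
        self._pps[when].append(pp)
        return pp
```

An unknown `when` value raises `KeyError` here, which matches what indexing `self._pps[when]` in the diff would do.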
@@ -599,7 +633,7 @@ class YoutubeDL(object):
# already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
elif 'TERM' in os.environ:
self._write_string('\033[0;%s\007' % message, self._screen_file)
self._write_string('\033]0;%s\007' % message, self._screen_file)
def save_console_title(self):
if not self.params.get('consoletitle', False):
@@ -698,15 +732,33 @@ class YoutubeDL(object):
def report_file_delete(self, file_name):
"""Report that existing file will be deleted."""
try:
self.to_screen('Deleting already existent file %s' % file_name)
self.to_screen('Deleting existing file %s' % file_name)
except UnicodeEncodeError:
self.to_screen('Deleting already existent file')
self.to_screen('Deleting existing file')
def prepare_filename(self, info_dict):
"""Generate the output filename."""
def parse_outtmpl(self):
outtmpl_dict = self.params.get('outtmpl', {})
if not isinstance(outtmpl_dict, dict):
outtmpl_dict = {'default': outtmpl_dict}
outtmpl_dict.update({
k: v for k, v in DEFAULT_OUTTMPL.items()
if not outtmpl_dict.get(k)})
for key, val in outtmpl_dict.items():
if isinstance(val, bytes):
self.report_warning(
'Parameter outtmpl is bytes, but should be a unicode string. '
'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.')
return outtmpl_dict
def _prepare_filename(self, info_dict, tmpl_type='default'):
try:
template_dict = dict(info_dict)
template_dict['duration_string'] = ( # %(duration>%H-%M-%S)s is wrong if duration > 24hrs
formatSeconds(info_dict['duration'], '-')
if info_dict.get('duration', None) is not None
else None)
template_dict['epoch'] = int(time.time())
autonumber_size = self.params.get('autonumber_size')
if autonumber_size is None:
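
`parse_outtmpl` above accepts either a single template or a dictionary of templates per file type, and fills in any missing entries from the defaults. The sketch below shows the same normalization and lookup in isolation. `HYPOTHETICAL_DEFAULTS`, `normalize_outtmpl` and `pick_template` are names made up for this example, and the default template string is an assumption rather than the project's actual `DEFAULT_OUTTMPL`.

```python
# Assumed defaults for illustration only; not the real DEFAULT_OUTTMPL.
HYPOTHETICAL_DEFAULTS = {'default': '%(title)s [%(id)s].%(ext)s'}


def normalize_outtmpl(outtmpl, defaults=HYPOTHETICAL_DEFAULTS):
    """Return a dict of templates, whatever form the option was given in."""
    if isinstance(outtmpl, dict):
        templates = dict(outtmpl)
    else:
        # A bare string is treated as the template for every file type.
        templates = {'default': outtmpl}
    for key, value in defaults.items():
        # Fill in missing or empty entries only.
        if not templates.get(key):
            templates[key] = value
    return templates


def pick_template(templates, tmpl_type='default'):
    """Use the type-specific template, else fall back to 'default'."""
    return templates.get(tmpl_type, templates['default'])
```

For example, `normalize_outtmpl({'thumbnail': 'thumbs/%(id)s.%(ext)s'})` keeps the thumbnail template and adds the assumed default, so a lookup for `'subtitle'` falls back to the default entry.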
@@ -727,9 +779,11 @@ class YoutubeDL(object):
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
for k, v in template_dict.items()
if v is not None and not isinstance(v, (list, tuple, dict)))
template_dict = collections.defaultdict(lambda: 'NA', template_dict)
na = self.params.get('outtmpl_na_placeholder', 'NA')
template_dict = collections.defaultdict(lambda: na, template_dict)
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
outtmpl = self.outtmpl_dict.get(tmpl_type, self.outtmpl_dict['default'])
force_ext = OUTTMPL_TYPES.get(tmpl_type)
# For fields playlist_index and autonumber convert all occurrences
# of %(field)s to %(field)0Nd for backward compatibility
@@ -745,27 +799,45 @@ class YoutubeDL(object):
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
outtmpl)
# As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
FORMAT_RE = r'''(?x)
(?<!%)
%
\({0}\) # mapping key
(?:[#0\-+ ]+)? # conversion flags (optional)
(?:\d+)? # minimum field width (optional)
(?:\.\d+)? # precision (optional)
[hlL]? # length modifier (optional)
(?P<type>[diouxXeEfFgGcrs%]) # conversion type
'''
numeric_fields = list(self._NUMERIC_FIELDS)
# Format date
FORMAT_DATE_RE = FORMAT_RE.format(r'(?P<key>(?P<field>\w+)>(?P<format>.+?))')
for mobj in re.finditer(FORMAT_DATE_RE, outtmpl):
conv_type, field, frmt, key = mobj.group('type', 'field', 'format', 'key')
if key in template_dict:
continue
value = strftime_or_none(template_dict.get(field), frmt, na)
if conv_type in 'crs': # string
value = sanitize(field, value)
else: # number
numeric_fields.append(key)
value = float_or_none(value, default=None)
if value is not None:
template_dict[key] = value
# Missing numeric fields used together with integer presentation types
# in format specification will break the argument substitution since
# string 'NA' is returned for missing fields. We will patch output
# template for missing fields to meet string presentation type.
for numeric_field in self._NUMERIC_FIELDS:
# string NA placeholder is returned for missing fields. We will patch
# output template for missing fields to meet string presentation type.
for numeric_field in numeric_fields:
if numeric_field not in template_dict:
# As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
FORMAT_RE = r'''(?x)
(?<!%)
%
\({0}\) # mapping key
(?:[#0\-+ ]+)? # conversion flags (optional)
(?:\d+)? # minimum field width (optional)
(?:\.\d+)? # precision (optional)
[hlL]? # length modifier (optional)
[diouxXeEfFgGcrs%] # conversion type
'''
outtmpl = re.sub(
FORMAT_RE.format(numeric_field),
FORMAT_RE.format(re.escape(numeric_field)),
r'%({0})s'.format(numeric_field), outtmpl)
# expand_path translates '%%' into '%' and '$$' into '$'
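
The hunk above adds date formatting to output templates: a key written as `field>strftime_format` is looked up, formatted, and stored in the template dictionary under the full key so that ordinary `%`-substitution can find it. The sketch below reproduces that idea in a simplified form. `expand_date_fields` and `DATE_FIELD_RE` are hypothetical names, and the sketch assumes date fields are `YYYYMMDD` strings; the real code uses `strftime_or_none` and also handles numeric conversion types.

```python
import datetime
import re

# Matches %(field>format)X, where X is a printf-style conversion type.
DATE_FIELD_RE = re.compile(
    r'(?<!%)%\((?P<key>(?P<field>\w+)>(?P<format>.+?))\)'
    r'(?P<type>[diouxXeEfFgGcrs])')


def expand_date_fields(outtmpl, info, na='NA'):
    """Fill an output template that may contain %(field>format)s keys."""
    values = dict(info)
    for mobj in DATE_FIELD_RE.finditer(outtmpl):
        key, field, fmt = mobj.group('key', 'field', 'format')
        if key in values:
            continue
        try:
            # Assumption: the field holds a YYYYMMDD string.
            date = datetime.datetime.strptime(values.get(field), '%Y%m%d')
            values[key] = date.strftime(fmt)
        except (TypeError, ValueError):
            # Missing or unparsable field: use the placeholder instead.
            values[key] = na
    # The whole 'field>format' text acts as the mapping key here.
    return outtmpl % values
```

The trick is that Python's `%` operator treats everything between the parentheses as the mapping key, so no rewriting of the template itself is needed.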
@@ -781,6 +853,9 @@ class YoutubeDL(object):
# title "Hello $PATH", we don't want `$PATH` to be expanded.
filename = expand_path(outtmpl).replace(sep, '') % template_dict
if force_ext is not None:
filename = replace_extension(filename, force_ext, template_dict.get('ext'))
# https://github.com/blackjack4494/youtube-dlc/issues/85
trim_file_name = self.params.get('trim_file_name', False)
if trim_file_name:
@@ -796,11 +871,36 @@ class YoutubeDL(object):
# to workaround encoding issues with subprocess on python2 @ Windows
if sys.version_info < (3, 0) and sys.platform == 'win32':
filename = encodeFilename(filename, True).decode(preferredencoding())
return sanitize_path(filename)
filename = sanitize_path(filename)
return filename
except ValueError as err:
self.report_error('Error in output template: ' + str(err) + ' (encoding: ' + repr(preferredencoding()) + ')')
return None
def prepare_filename(self, info_dict, dir_type='', warn=False):
"""Generate the output filename."""
paths = self.params.get('paths', {})
assert isinstance(paths, dict)
filename = self._prepare_filename(info_dict, dir_type or 'default')
if warn and not self.__prepare_filename_warned:
if not paths:
pass
elif filename == '-':
self.report_warning('--paths is ignored when outputting to stdout')
elif os.path.isabs(filename):
self.report_warning('--paths is ignored since an absolute path is given in output template')
self.__prepare_filename_warned = True
if filename == '-' or not filename:
return filename
homepath = expand_path(paths.get('home', '').strip())
assert isinstance(homepath, compat_str)
subdir = expand_path(paths.get(dir_type, '').strip()) if dir_type else ''
assert isinstance(subdir, compat_str)
return sanitize_path(os.path.join(homepath, subdir, filename))
def _match_entry(self, info_dict, incomplete):
""" Returns None if the file should be downloaded """
@@ -885,16 +985,16 @@ class YoutubeDL(object):
'and will probably not work.')
try:
temp_id = ie.extract_id(url) if callable(getattr(ie, 'extract_id', None)) else ie._match_id(url)
temp_id = str_or_none(
ie.extract_id(url) if callable(getattr(ie, 'extract_id', None))
else ie._match_id(url))
except (AssertionError, IndexError, AttributeError):
temp_id = None
if temp_id is not None and self.in_download_archive({'id': temp_id, 'ie_key': ie_key}):
self.to_screen("[%s] %s: has already been recorded in archive" % (
ie_key, temp_id))
break
return self.__extract_info(url, ie, download, extra_info, process, info_dict)
else:
self.report_error('no suitable InfoExtractor for URL %s' % url)
@@ -946,10 +1046,6 @@ class YoutubeDL(object):
self.add_extra_info(ie_result, {
'extractor': ie.IE_NAME,
'webpage_url': url,
'duration_string': (
formatSeconds(ie_result['duration'], '-')
if ie_result.get('duration', None) is not None
else None),
'webpage_url_basename': url_basename(url),
'extractor_key': ie.ie_key(),
})
@@ -969,9 +1065,7 @@ class YoutubeDL(object):
extract_flat = self.params.get('extract_flat', False)
if ((extract_flat == 'in_playlist' and 'playlist' in extra_info)
or extract_flat is True):
self.__forced_printings(
ie_result, self.prepare_filename(ie_result),
incomplete=True)
self.__forced_printings(ie_result, self.prepare_filename(ie_result), incomplete=True)
return ie_result
if result_type == 'video':
@@ -1062,6 +1156,53 @@ class YoutubeDL(object):
playlist = ie_result.get('title') or ie_result.get('id')
self.to_screen('[download] Downloading playlist: %s' % playlist)
if self.params.get('allow_playlist_files', True):
ie_copy = {
'playlist': playlist,
'playlist_id': ie_result.get('id'),
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': 0
}
ie_copy.update(dict(ie_result))
def ensure_dir_exists(path):
return make_dir(path, self.report_error)
if self.params.get('writeinfojson', False):
infofn = self.prepare_filename(ie_copy, 'pl_infojson')
if not ensure_dir_exists(encodeFilename(infofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
self.to_screen('[info] Playlist metadata is already present')
else:
playlist_info = dict(ie_result)
# playlist_info['entries'] = list(playlist_info['entries']) # Entries is a generator which should not be resolved here
del playlist_info['entries']
self.to_screen('[info] Writing playlist metadata as JSON to: ' + infofn)
try:
write_json_file(self.filter_requested_info(playlist_info), infofn)
except (OSError, IOError):
self.report_error('Cannot write playlist metadata to JSON file ' + infofn)
if self.params.get('writedescription', False):
descfn = self.prepare_filename(ie_copy, 'pl_description')
if not ensure_dir_exists(encodeFilename(descfn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
self.to_screen('[info] Playlist description is already present')
elif ie_result.get('description') is None:
self.report_warning('There\'s no playlist description to write.')
else:
try:
self.to_screen('[info] Writing playlist description to: ' + descfn)
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
descfile.write(ie_result['description'])
except (OSError, IOError):
self.report_error('Cannot write playlist description file ' + descfn)
return
playlist_results = []
playliststart = self.params.get('playliststart', 1) - 1
@@ -1246,7 +1387,7 @@ class YoutubeDL(object):
and (
not can_merge()
or info_dict.get('is_live', False)
or self.params.get('outtmpl', DEFAULT_OUTTMPL) == '-'))
or self.outtmpl_dict['default'] == '-'))
return (
'best/bestvideo+bestaudio'
@@ -1757,7 +1898,7 @@ class YoutubeDL(object):
if req_format is None:
req_format = self._default_format_spec(info_dict, download=download)
if self.params.get('verbose'):
self._write_string('[debug] Default format spec: %s\n' % req_format)
self.to_screen('[debug] Default format spec: %s' % req_format)
format_selector = self.build_format_selector(req_format)
@@ -1888,6 +2029,8 @@ class YoutubeDL(object):
assert info_dict.get('_type', 'video') == 'video'
info_dict.setdefault('__postprocessors', [])
max_downloads = self.params.get('max_downloads')
if max_downloads is not None:
if self._num_downloads >= int(max_downloads):
@@ -1904,10 +2047,15 @@ class YoutubeDL(object):
self._num_downloads += 1
info_dict['_filename'] = filename = self.prepare_filename(info_dict)
info_dict = self.pre_process(info_dict)
info_dict['_filename'] = full_filename = self.prepare_filename(info_dict, warn=True)
temp_filename = self.prepare_filename(info_dict, 'temp')
files_to_move = {}
skip_dl = self.params.get('skip_download', False)
# Forced printings
self.__forced_printings(info_dict, filename, incomplete=False)
self.__forced_printings(info_dict, full_filename, incomplete=False)
if self.params.get('simulate', False):
if self.params.get('force_write_download_archive', False):
@@ -1916,24 +2064,21 @@ class YoutubeDL(object):
# Do nothing else if in simulate mode
return
if filename is None:
if full_filename is None:
return
def ensure_dir_exists(path):
try:
dn = os.path.dirname(path)
if dn and not os.path.exists(dn):
os.makedirs(dn)
return True
except (OSError, IOError) as err:
self.report_error('unable to create directory ' + error_to_compat_str(err))
return False
return make_dir(path, self.report_error)
if not ensure_dir_exists(sanitize_path(encodeFilename(filename))):
if not ensure_dir_exists(encodeFilename(full_filename)):
return
if not ensure_dir_exists(encodeFilename(temp_filename)):
return
if self.params.get('writedescription', False):
descfn = replace_extension(filename, 'description', info_dict.get('ext'))
descfn = self.prepare_filename(info_dict, 'description')
if not ensure_dir_exists(encodeFilename(descfn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
self.to_screen('[info] Video description is already present')
elif info_dict.get('description') is None:
@@ -1948,7 +2093,9 @@ class YoutubeDL(object):
return
if self.params.get('writeannotations', False):
annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext'))
annofn = self.prepare_filename(info_dict, 'annotation')
if not ensure_dir_exists(encodeFilename(annofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(annofn)):
self.to_screen('[info] Video annotations are already present')
elif not info_dict.get('annotations'):
@@ -1982,9 +2129,14 @@ class YoutubeDL(object):
# ie = self.get_info_extractor(info_dict['extractor_key'])
for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext']
sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext'))
sub_fn = self.prepare_filename(info_dict, 'subtitle')
sub_filename = subtitles_filename(
temp_filename if not skip_dl else sub_fn,
sub_lang, sub_format, info_dict.get('ext'))
sub_filename_final = subtitles_filename(sub_fn, sub_lang, sub_format, info_dict.get('ext'))
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(sub_filename)):
self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
files_to_move[sub_filename] = sub_filename_final
else:
self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
if sub_info.get('data') is not None:
@@ -1993,6 +2145,7 @@ class YoutubeDL(object):
# See https://github.com/ytdl-org/youtube-dl/issues/10268
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
subfile.write(sub_info['data'])
files_to_move[sub_filename] = sub_filename_final
except (OSError, IOError):
self.report_error('Cannot write subtitles file ' + sub_filename)
return
@@ -2008,47 +2161,55 @@ class YoutubeDL(object):
with io.open(encodeFilename(sub_filename), 'wb') as subfile:
subfile.write(sub_data)
'''
files_to_move[sub_filename] = sub_filename_final
except (ExtractorError, IOError, OSError, ValueError, compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_warning('Unable to download subtitle for "%s": %s' %
(sub_lang, error_to_compat_str(err)))
continue
if self.params.get('skip_download', False):
if skip_dl:
if self.params.get('convertsubtitles', False):
subconv = FFmpegSubtitlesConvertorPP(self, format=self.params.get('convertsubtitles'))
filename_real_ext = os.path.splitext(filename)[1][1:]
# subconv = FFmpegSubtitlesConvertorPP(self, format=self.params.get('convertsubtitles'))
filename_real_ext = os.path.splitext(full_filename)[1][1:]
filename_wo_ext = (
os.path.splitext(filename)[0]
os.path.splitext(full_filename)[0]
if filename_real_ext == info_dict['ext']
else filename)
else full_filename)
afilename = '%s.%s' % (filename_wo_ext, self.params.get('convertsubtitles'))
if subconv.available:
info_dict.setdefault('__postprocessors', [])
# info_dict['__postprocessors'].append(subconv)
# if subconv.available:
# info_dict['__postprocessors'].append(subconv)
if os.path.exists(encodeFilename(afilename)):
self.to_screen(
'[download] %s has already been downloaded and '
'converted' % afilename)
else:
try:
self.post_process(filename, info_dict)
except (PostProcessingError) as err:
self.report_error('postprocessing: %s' % str(err))
self.post_process(full_filename, info_dict, files_to_move)
except PostProcessingError as err:
self.report_error('Postprocessing: %s' % str(err))
return
if self.params.get('writeinfojson', False):
infofn = replace_extension(filename, 'info.json', info_dict.get('ext'))
infofn = self.prepare_filename(info_dict, 'infojson')
if not ensure_dir_exists(encodeFilename(infofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
self.to_screen('[info] Video description metadata is already present')
self.to_screen('[info] Video metadata is already present')
else:
self.to_screen('[info] Writing video description metadata as JSON to: ' + infofn)
self.to_screen('[info] Writing video metadata as JSON to: ' + infofn)
try:
write_json_file(self.filter_requested_info(info_dict), infofn)
except (OSError, IOError):
self.report_error('Cannot write metadata to JSON file ' + infofn)
self.report_error('Cannot write video metadata to JSON file ' + infofn)
return
info_dict['__infojson_filename'] = infofn
self._write_thumbnails(info_dict, filename)
thumbfn = self.prepare_filename(info_dict, 'thumbnail')
thumb_fn_temp = temp_filename if not skip_dl else thumbfn
for thumb_ext in self._write_thumbnails(info_dict, thumb_fn_temp):
thumb_filename_temp = replace_extension(thumb_fn_temp, thumb_ext, info_dict.get('ext'))
thumb_filename = replace_extension(thumbfn, thumb_ext, info_dict.get('ext'))
files_to_move[thumb_filename_temp] = info_dict['__thumbnail_filename'] = thumb_filename
# Write internet shortcut files
url_link = webloc_link = desktop_link = False
@@ -2073,8 +2234,8 @@ class YoutubeDL(object):
ascii_url = iri_to_uri(info_dict['webpage_url'])
def _write_link_file(extension, template, newline, embed_filename):
linkfn = replace_extension(filename, extension, info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(linkfn)):
linkfn = replace_extension(full_filename, extension, info_dict.get('ext'))
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(linkfn)):
self.to_screen('[info] Internet shortcut is already present')
else:
try:
@@ -2101,16 +2262,39 @@ class YoutubeDL(object):
# Download
must_record_download_archive = False
if not self.params.get('skip_download', False):
if not skip_dl:
try:
def existing_file(*filepaths):
ext = info_dict.get('ext')
final_ext = self.params.get('final_ext', ext)
existing_files = []
for file in orderedSet(filepaths):
if final_ext != ext:
converted = replace_extension(file, final_ext, ext)
if os.path.exists(encodeFilename(converted)):
existing_files.append(converted)
if os.path.exists(encodeFilename(file)):
existing_files.append(file)
if not existing_files or self.params.get('overwrites', False):
for file in orderedSet(existing_files):
self.report_file_delete(file)
os.remove(encodeFilename(file))
return None
self.report_file_already_downloaded(existing_files[0])
info_dict['ext'] = os.path.splitext(existing_files[0])[1][1:]
return existing_files[0]
success = True
if info_dict.get('requested_formats') is not None:
downloaded = []
success = True
merger = FFmpegMergerPP(self)
if not merger.available:
postprocessors = []
self.report_warning('You have requested multiple '
'formats but ffmpeg or avconv are not installed.'
'formats but ffmpeg is not installed.'
' The formats won\'t be merged.')
else:
postprocessors = [merger]
@@ -2134,32 +2318,31 @@ class YoutubeDL(object):
# TODO: Check acodec/vcodec
return False
filename_real_ext = os.path.splitext(filename)[1][1:]
filename_wo_ext = (
os.path.splitext(filename)[0]
if filename_real_ext == info_dict['ext']
else filename)
requested_formats = info_dict['requested_formats']
old_ext = info_dict['ext']
if self.params.get('merge_output_format') is None and not compatible_formats(requested_formats):
info_dict['ext'] = 'mkv'
self.report_warning(
'Requested formats are incompatible for merge and will be merged into mkv.')
def correct_ext(filename):
filename_real_ext = os.path.splitext(filename)[1][1:]
filename_wo_ext = (
os.path.splitext(filename)[0]
if filename_real_ext == old_ext
else filename)
return '%s.%s' % (filename_wo_ext, info_dict['ext'])
# Ensure filename always has a correct extension for successful merge
filename = '%s.%s' % (filename_wo_ext, info_dict['ext'])
file_exists = os.path.exists(encodeFilename(filename))
if not self.params.get('overwrites', False) and file_exists:
self.to_screen(
'[download] %s has already been downloaded and '
'merged' % filename)
else:
if file_exists:
self.report_file_delete(filename)
os.remove(encodeFilename(filename))
full_filename = correct_ext(full_filename)
temp_filename = correct_ext(temp_filename)
dl_filename = existing_file(full_filename, temp_filename)
if dl_filename is None:
for f in requested_formats:
new_info = dict(info_dict)
new_info.update(f)
fname = prepend_extension(
self.prepare_filename(new_info),
self.prepare_filename(new_info, 'temp'),
'f%s' % f['format_id'], new_info['ext'])
if not ensure_dir_exists(fname):
return
@@ -2171,14 +2354,15 @@ class YoutubeDL(object):
# Even if there were no downloads, it is being merged only now
info_dict['__real_download'] = True
else:
# Delete existing file with --yes-overwrites
if self.params.get('overwrites', False):
if os.path.exists(encodeFilename(filename)):
self.report_file_delete(filename)
os.remove(encodeFilename(filename))
# Just a single file
success, real_download = dl(filename, info_dict)
info_dict['__real_download'] = real_download
dl_filename = existing_file(full_filename, temp_filename)
if dl_filename is None:
success, real_download = dl(temp_filename, info_dict)
info_dict['__real_download'] = real_download
dl_filename = dl_filename or temp_filename
info_dict['__finaldir'] = os.path.dirname(os.path.abspath(encodeFilename(full_filename)))
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_error('unable to download video data: %s' % error_to_compat_str(err))
return
@@ -2188,13 +2372,13 @@ class YoutubeDL(object):
self.report_error('content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
return
if success and filename != '-':
if success and full_filename != '-':
# Fixup content
fixup_policy = self.params.get('fixup')
if fixup_policy is None:
fixup_policy = 'detect_or_warn'
INSTALL_FFMPEG_MESSAGE = 'Install ffmpeg or avconv to fix this automatically.'
INSTALL_FFMPEG_MESSAGE = 'Install ffmpeg to fix this automatically.'
stretched_ratio = info_dict.get('stretched_ratio')
if stretched_ratio is not None and stretched_ratio != 1:
@@ -2204,7 +2388,6 @@ class YoutubeDL(object):
elif fixup_policy == 'detect_or_warn':
stretched_pp = FFmpegFixupStretchedPP(self)
if stretched_pp.available:
info_dict.setdefault('__postprocessors', [])
info_dict['__postprocessors'].append(stretched_pp)
else:
self.report_warning(
@@ -2214,7 +2397,8 @@ class YoutubeDL(object):
assert fixup_policy in ('ignore', 'never')
if (info_dict.get('requested_formats') is None
and info_dict.get('container') == 'm4a_dash'):
and info_dict.get('container') == 'm4a_dash'
and info_dict.get('ext') == 'm4a'):
if fixup_policy == 'warn':
self.report_warning(
'%s: writing DASH m4a. '
@@ -2223,7 +2407,6 @@ class YoutubeDL(object):
elif fixup_policy == 'detect_or_warn':
fixup_pp = FFmpegFixupM4aPP(self)
if fixup_pp.available:
info_dict.setdefault('__postprocessors', [])
info_dict['__postprocessors'].append(fixup_pp)
else:
self.report_warning(
@@ -2242,7 +2425,6 @@ class YoutubeDL(object):
elif fixup_policy == 'detect_or_warn':
fixup_pp = FFmpegFixupM3u8PP(self)
if fixup_pp.available:
info_dict.setdefault('__postprocessors', [])
info_dict['__postprocessors'].append(fixup_pp)
else:
self.report_warning(
@@ -2252,13 +2434,13 @@ class YoutubeDL(object):
assert fixup_policy in ('ignore', 'never')
try:
self.post_process(filename, info_dict)
except (PostProcessingError) as err:
self.report_error('postprocessing: %s' % str(err))
self.post_process(dl_filename, info_dict, files_to_move)
except PostProcessingError as err:
self.report_error('Postprocessing: %s' % str(err))
return
try:
for ph in self._post_hooks:
ph(filename)
ph(full_filename)
except Exception as err:
self.report_error('post hooks: %s' % str(err))
return
@@ -2272,7 +2454,7 @@ class YoutubeDL(object):
def download(self, url_list):
"""Download a given list of URLs."""
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
outtmpl = self.outtmpl_dict['default']
if (len(url_list) > 1
and outtmpl != '-'
and '%' not in outtmpl
@@ -2320,31 +2502,48 @@ class YoutubeDL(object):
@staticmethod
def filter_requested_info(info_dict):
fields_to_remove = ('requested_formats', 'requested_subtitles')
return dict(
(k, v) for k, v in info_dict.items()
if k not in ['requested_formats', 'requested_subtitles'])
if (k[0] != '_' or k == '_type') and k not in fields_to_remove)
def post_process(self, filename, ie_info):
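def run_pp(self, pp, infodict, files_to_move=None):
# Avoid a shared mutable default: create a fresh dict per call when none is passed
files_to_move = {} if files_to_move is None else files_to_move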
files_to_delete = []
files_to_delete, infodict = pp.run(infodict)
if not files_to_delete:
return files_to_move, infodict
if self.params.get('keepvideo', False):
for f in files_to_delete:
files_to_move.setdefault(f, '')
else:
for old_filename in set(files_to_delete):
self.to_screen('Deleting original file %s (pass -k to keep)' % old_filename)
try:
os.remove(encodeFilename(old_filename))
except (IOError, OSError):
self.report_warning('Unable to remove downloaded original file')
if old_filename in files_to_move:
del files_to_move[old_filename]
return files_to_move, infodict
def pre_process(self, ie_info):
info = dict(ie_info)
for pp in self._pps['beforedl']:
info = self.run_pp(pp, info)[1]
return info
def post_process(self, filename, ie_info, files_to_move={}):
"""Run all the postprocessors on the given file."""
info = dict(ie_info)
info['filepath'] = filename
pps_chain = []
if ie_info.get('__postprocessors') is not None:
pps_chain.extend(ie_info['__postprocessors'])
pps_chain.extend(self._pps)
for pp in pps_chain:
files_to_delete = []
try:
files_to_delete, info = pp.run(info)
except PostProcessingError as e:
self.report_error(e.msg)
if files_to_delete and not self.params.get('keepvideo', False):
for old_filename in set(files_to_delete):
self.to_screen('Deleting original file %s (pass -k to keep)' % old_filename)
try:
os.remove(encodeFilename(old_filename))
except (IOError, OSError):
self.report_warning('Unable to remove downloaded original file')
info['__files_to_move'] = {}
for pp in ie_info.get('__postprocessors', []) + self._pps['normal']:
files_to_move, info = self.run_pp(pp, info, files_to_move)
info = self.run_pp(MoveFilesAfterDownloadPP(self, files_to_move), info)[1]
for pp in self._pps['aftermove']:
info = self.run_pp(pp, info, {})[1]
def _make_archive_id(self, info_dict):
video_id = info_dict.get('id')
@@ -2364,7 +2563,7 @@ class YoutubeDL(object):
break
else:
return
return extractor.lower() + ' ' + video_id
return '%s %s' % (extractor.lower(), video_id)
def in_download_archive(self, info_dict):
fn = self.params.get('download_archive')
@@ -2565,9 +2764,12 @@ class YoutubeDL(object):
self.get_encoding()))
write_string(encoding_str, encoding=None)
self._write_string('[debug] yt-dlp version ' + __version__ + '\n')
self._write_string('[debug] yt-dlp version %s\n' % __version__)
if _LAZY_LOADER:
self._write_string('[debug] Lazy loading extractors enabled' + '\n')
self._write_string('[debug] Lazy loading extractors enabled\n')
if _PLUGIN_CLASSES:
self._write_string(
'[debug] Plugin Extractors: %s\n' % [ie.ie_key() for ie in _PLUGIN_CLASSES])
try:
sp = subprocess.Popen(
['git', 'rev-parse', '--short', 'HEAD'],
@@ -2576,7 +2778,7 @@ class YoutubeDL(object):
out, err = process_communicate_or_kill(sp)
out = out.decode().strip()
if re.match('[0-9a-f]+', out):
self._write_string('[debug] Git HEAD: ' + out + '\n')
self._write_string('[debug] Git HEAD: %s\n' % out)
except Exception:
try:
sys.exc_clear()
@@ -2692,27 +2894,25 @@ class YoutubeDL(object):
encoding = preferredencoding()
return encoding
def _write_thumbnails(self, info_dict, filename):
def _write_thumbnails(self, info_dict, filename): # return the extensions
if self.params.get('writethumbnail', False):
thumbnails = info_dict.get('thumbnails')
if thumbnails:
thumbnails = [thumbnails[-1]]
elif self.params.get('write_all_thumbnails', False):
thumbnails = info_dict.get('thumbnails')
thumbnails = info_dict.get('thumbnails') or []
else:
return
if not thumbnails:
# No thumbnails present, so return immediately
return
thumbnails = []
ret = []
for t in thumbnails:
thumb_ext = determine_ext(t['url'], 'jpg')
suffix = '_%s' % t['id'] if len(thumbnails) > 1 else ''
suffix = '%s.' % t['id'] if len(thumbnails) > 1 else ''
thumb_display_id = '%s ' % t['id'] if len(thumbnails) > 1 else ''
t['filename'] = thumb_filename = replace_extension(filename + suffix, thumb_ext, info_dict.get('ext'))
t['filename'] = thumb_filename = replace_extension(filename, suffix + thumb_ext, info_dict.get('ext'))
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(thumb_filename)):
ret.append(suffix + thumb_ext)
self.to_screen('[%s] %s: Thumbnail %sis already present' %
(info_dict['extractor'], info_dict['id'], thumb_display_id))
else:
@@ -2722,8 +2922,10 @@ class YoutubeDL(object):
uf = self.urlopen(t['url'])
with open(encodeFilename(thumb_filename), 'wb') as thumbf:
shutil.copyfileobj(uf, thumbf)
ret.append(suffix + thumb_ext)
self.to_screen('[%s] %s: Writing thumbnail %sto: %s' %
(info_dict['extractor'], info_dict['id'], thumb_display_id, thumb_filename))
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_warning('Unable to download thumbnail "%s": %s' %
(t['url'], error_to_compat_str(err)))
return ret
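The revised `filter_requested_info` earlier in this file's diff strips internal fields before the info JSON is written. A minimal standalone sketch of that filtering rule follows. The function name `strip_internal_fields` and the sample dictionary are hypothetical; the condition itself is copied from the diff.

```python
FIELDS_TO_REMOVE = ('requested_formats', 'requested_subtitles')


def strip_internal_fields(info_dict):
    """Return a copy without private keys (leading underscore), keeping '_type'."""
    return dict(
        (k, v) for k, v in info_dict.items()
        if (k[0] != '_' or k == '_type') and k not in FIELDS_TO_REMOVE)


# Hypothetical info dict containing a mix of public and internal fields
sample = {
    'id': 'abc123',
    '_type': 'video',
    '_filename': 'abc123.mp4',
    '__infojson_filename': 'abc123.info.json',
    'requested_formats': [],
}
cleaned = strip_internal_fields(sample)
print(sorted(cleaned))
```

Note that `k[0]` assumes every key is a non-empty string; an empty key would raise `IndexError`.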


@@ -15,10 +15,10 @@ import sys
from .options import (
parseOpts,
_remux_formats,
)
from .compat import (
compat_getpass,
compat_shlex_split,
workaround_optparse_bug9161,
)
from .utils import (
@@ -46,6 +46,7 @@ from .downloader import (
from .extractor import gen_extractors, list_extractors
from .extractor.common import InfoExtractor
from .extractor.adobepass import MSO_INFO
from .postprocessor.metadatafromfield import MetadataFromFieldPP
from .YoutubeDL import YoutubeDL
@@ -70,14 +71,7 @@ def _real_main(argv=None):
std_headers['Referer'] = opts.referer
# Custom HTTP headers
if opts.headers is not None:
for h in opts.headers:
if ':' not in h:
parser.error('wrong header formatting, it should be key:value, not "%s"' % h)
key, value = h.split(':', 1)
if opts.verbose:
write_string('[debug] Adding header from command line option %s:%s\n' % (key, value))
std_headers[key] = value
std_headers.update(opts.headers)
# Dump user agent
if opts.dump_user_agent:
@@ -216,12 +210,15 @@ def _real_main(argv=None):
opts.audioquality = opts.audioquality.strip('k').strip('K')
if not opts.audioquality.isdigit():
parser.error('invalid audio quality specified')
if opts.remuxvideo is not None:
if opts.remuxvideo not in ['mp4', 'mkv']:
parser.error('invalid video container format specified')
if opts.recodevideo is not None:
if opts.recodevideo not in ['mp4', 'flv', 'webm', 'ogg', 'mkv', 'avi']:
if opts.recodevideo not in _remux_formats:
parser.error('invalid video recode format specified')
if opts.remuxvideo and opts.recodevideo:
opts.remuxvideo = None
write_string('WARNING: --remux-video is ignored since --recode-video was given\n', out=sys.stderr)
if opts.remuxvideo is not None:
if opts.remuxvideo not in _remux_formats:
parser.error('invalid video remux format specified')
if opts.convertsubtitles is not None:
if opts.convertsubtitles not in ['srt', 'vtt', 'ass', 'lrc']:
parser.error('invalid subtitle format specified')
@@ -240,32 +237,45 @@ def _real_main(argv=None):
if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True
outtmpl = ((opts.outtmpl is not None and opts.outtmpl)
or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s')
or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and '%(title)s-%(id)s.%(ext)s')
or (opts.useid and '%(id)s.%(ext)s')
or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s')
or DEFAULT_OUTTMPL)
if not os.path.splitext(outtmpl)[1] and opts.extractaudio:
outtmpl = opts.outtmpl
if not outtmpl:
outtmpl = {'default': (
'%(title)s-%(id)s-%(format)s.%(ext)s' if opts.format == '-1' and opts.usetitle
else '%(id)s-%(format)s.%(ext)s' if opts.format == '-1'
else '%(autonumber)s-%(title)s-%(id)s.%(ext)s' if opts.usetitle and opts.autonumber
else '%(title)s-%(id)s.%(ext)s' if opts.usetitle
else '%(id)s.%(ext)s' if opts.useid
else '%(autonumber)s-%(id)s.%(ext)s' if opts.autonumber
else None)}
outtmpl_default = outtmpl.get('default')
if outtmpl_default is not None and not os.path.splitext(outtmpl_default)[1] and opts.extractaudio:
parser.error('Cannot download a video and extract audio into the same'
' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
' template'.format(outtmpl))
' template'.format(outtmpl_default))
for f in opts.format_sort:
if re.match(InfoExtractor.FormatSort.regex, f) is None:
parser.error('invalid format sort string "%s" specified' % f)
if opts.metafromfield is None:
opts.metafromfield = []
if opts.metafromtitle is not None:
opts.metafromfield.append('title:%s' % opts.metafromtitle)
for f in opts.metafromfield:
if re.match(MetadataFromFieldPP.regex, f) is None:
parser.error('invalid format string "%s" specified for --parse-metadata' % f)
any_getting = opts.geturl or opts.gettitle or opts.getid or opts.getthumbnail or opts.getdescription or opts.getfilename or opts.getformat or opts.getduration or opts.dumpjson or opts.dump_single_json
any_printing = opts.print_json
download_archive_fn = expand_path(opts.download_archive) if opts.download_archive is not None else opts.download_archive
# PostProcessors
postprocessors = []
if opts.metafromtitle:
if opts.metafromfield:
postprocessors.append({
'key': 'MetadataFromTitle',
'titleformat': opts.metafromtitle
'key': 'MetadataFromField',
'formats': opts.metafromfield,
'when': 'beforedl'
})
if opts.extractaudio:
postprocessors.append({
@@ -326,32 +336,24 @@ def _real_main(argv=None):
'force': opts.sponskrub_force,
'ignoreerror': opts.sponskrub is None,
})
# Please keep ExecAfterDownload towards the bottom as it allows the user to modify the final file in any way.
# So if the user is able to remove the file before your postprocessor runs it might cause a few problems.
# ExecAfterDownload must be the last PP
if opts.exec_cmd:
postprocessors.append({
'key': 'ExecAfterDownload',
'exec_cmd': opts.exec_cmd,
'when': 'aftermove'
})
external_downloader_args = None
if opts.external_downloader_args:
external_downloader_args = compat_shlex_split(opts.external_downloader_args)
postprocessor_args = {}
if opts.postprocessor_args is not None:
for string in opts.postprocessor_args:
mobj = re.match(r'(?P<pp>\w+(?:\+\w+)?):(?P<args>.*)$', string)
if mobj is None:
if 'sponskrub' not in postprocessor_args: # for backward compatibility
postprocessor_args['sponskrub'] = []
if opts.verbose:
write_string('[debug] Adding postprocessor args from command line option sponskrub: \n')
pp_key, pp_args = 'default', string
else:
pp_key, pp_args = mobj.group('pp').lower(), mobj.group('args')
if opts.verbose:
write_string('[debug] Adding postprocessor args from command line option %s: %s\n' % (pp_key, pp_args))
postprocessor_args[pp_key] = compat_shlex_split(pp_args)
_args_compat_warning = 'WARNING: %s given without specifying name. The arguments will be given to all %s\n'
if 'default' in opts.external_downloader_args:
write_string(_args_compat_warning % ('--external-downloader-args', 'external downloaders'), out=sys.stderr)
if 'default-compat' in opts.postprocessor_args and 'default' not in opts.postprocessor_args:
write_string(_args_compat_warning % ('--post-processor-args', 'post-processors'), out=sys.stderr)
opts.postprocessor_args.setdefault('sponskrub', [])
opts.postprocessor_args['default'] = opts.postprocessor_args['default-compat']
audio_ext = opts.audioformat if (opts.extractaudio and opts.audioformat != 'best') else None
match_filter = (
None if opts.match_filter is None
@@ -390,6 +392,8 @@ def _real_main(argv=None):
'listformats': opts.listformats,
'listformats_table': opts.listformats_table,
'outtmpl': outtmpl,
'outtmpl_na_placeholder': opts.outtmpl_na_placeholder,
'paths': opts.paths,
'autonumber_size': opts.autonumber_size,
'autonumber_start': opts.autonumber_start,
'restrictfilenames': opts.restrictfilenames,
@@ -412,13 +416,14 @@ def _real_main(argv=None):
'playlistreverse': opts.playlist_reverse,
'playlistrandom': opts.playlist_random,
'noplaylist': opts.noplaylist,
'logtostderr': opts.outtmpl == '-',
'logtostderr': outtmpl_default == '-',
'consoletitle': opts.consoletitle,
'nopart': opts.nopart,
'updatetime': opts.updatetime,
'writedescription': opts.writedescription,
'writeannotations': opts.writeannotations,
'writeinfojson': opts.writeinfojson,
'writeinfojson': opts.writeinfojson or opts.getcomments,
'getcomments': opts.getcomments,
'writethumbnail': opts.writethumbnail,
'write_all_thumbnails': opts.write_all_thumbnails,
'writelink': opts.writelink,
@@ -469,6 +474,7 @@ def _real_main(argv=None):
'extract_flat': opts.extract_flat,
'mark_watched': opts.mark_watched,
'merge_output_format': opts.merge_output_format,
'final_ext': opts.recodevideo or opts.remuxvideo or audio_ext,
'postprocessors': postprocessors,
'fixup': opts.fixup,
'source_address': opts.source_address,
@@ -485,8 +491,8 @@ def _real_main(argv=None):
'ffmpeg_location': opts.ffmpeg_location,
'hls_prefer_native': opts.hls_prefer_native,
'hls_use_mpegts': opts.hls_use_mpegts,
'external_downloader_args': external_downloader_args,
'postprocessor_args': postprocessor_args,
'external_downloader_args': opts.external_downloader_args,
'postprocessor_args': opts.postprocessor_args,
'cn_verification_proxy': opts.cn_verification_proxy,
'geo_verification_proxy': opts.geo_verification_proxy,
'config_location': opts.config_location,
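The chained conditional that builds the `'default'` output template in this file's diff can be hard to read at a glance. The sketch below restates the same precedence as early returns. The function name `legacy_default_template` and its keyword arguments are hypothetical stand-ins for the `opts` attributes; the template strings and their order are taken from the diff.

```python
def legacy_default_template(fmt=None, usetitle=False, useid=False, autonumber=False):
    """Pick the fallback template; None means no legacy flag applies."""
    if fmt == '-1' and usetitle:
        return '%(title)s-%(id)s-%(format)s.%(ext)s'
    if fmt == '-1':
        return '%(id)s-%(format)s.%(ext)s'
    if usetitle and autonumber:
        return '%(autonumber)s-%(title)s-%(id)s.%(ext)s'
    if usetitle:
        return '%(title)s-%(id)s.%(ext)s'
    if useid:
        return '%(id)s.%(ext)s'
    if autonumber:
        return '%(autonumber)s-%(id)s.%(ext)s'
    return None


# Mirrors the dict shape used in the diff: one template under the 'default' key
outtmpl = {'default': legacy_default_template(usetitle=True, autonumber=True)}
print(outtmpl['default'])
```

Earlier branches win, so `fmt == '-1'` takes priority over `autonumber` even when both are set.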


@@ -332,7 +332,7 @@ class FileDownloader(object):
"""
nooverwrites_and_exists = (
not self.params.get('overwrites', True)
not self.params.get('overwrites', subtitle)
and os.path.exists(encodeFilename(filename))
)


@@ -95,7 +95,8 @@ class ExternalFD(FileDownloader):
return cli_valueless_option(self.params, command_option, param, expected_value)
def _configuration_args(self, default=[]):
return cli_configuration_args(self.params, 'external_downloader_args', default)
return cli_configuration_args(
self.params, 'external_downloader_args', self.get_basename(), default)[0]
def _call_downloader(self, tmpfilename, info_dict):
""" Either overwrite this or implement _make_cmd """
@@ -232,7 +233,7 @@ class FFmpegFD(ExternalFD):
url = info_dict['url']
ffpp = FFmpegPostProcessor(downloader=self)
if not ffpp.available:
self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
self.report_error('m3u8 download detected but ffmpeg could not be found. Please install it.')
return False
ffpp.check_version()


@@ -4,6 +4,9 @@ import re
import json
from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..utils import try_get
from ..extractor.youtube import YoutubeBaseInfoExtractor as YT_BaseIE
class YoutubeLiveChatReplayFD(FragmentFD):
@@ -15,6 +18,7 @@ class YoutubeLiveChatReplayFD(FragmentFD):
video_id = info_dict['video_id']
self.to_screen('[%s] Downloading live chat' % self.FD_NAME)
fragment_retries = self.params.get('fragment_retries', 0)
test = self.params.get('test', False)
ctx = {
@@ -28,15 +32,52 @@ class YoutubeLiveChatReplayFD(FragmentFD):
return self._download_fragment(ctx, url, info_dict, headers)
def parse_yt_initial_data(data):
window_patt = b'window\\["ytInitialData"\\]\\s*=\\s*(.*?)(?<=});'
var_patt = b'var\\s+ytInitialData\\s*=\\s*(.*?)(?<=});'
for patt in window_patt, var_patt:
patterns = (
r'%s\s*%s' % (YT_BaseIE._YT_INITIAL_DATA_RE, YT_BaseIE._YT_INITIAL_BOUNDARY_RE),
r'%s' % YT_BaseIE._YT_INITIAL_DATA_RE)
data = data.decode('utf-8', 'replace')
for patt in patterns:
try:
raw_json = re.search(patt, data).group(1)
return json.loads(raw_json)
except AttributeError:
continue
def download_and_parse_fragment(url, frag_index):
count = 0
while count <= fragment_retries:
try:
success, raw_fragment = dl_fragment(url)
if not success:
return False, None, None
data = parse_yt_initial_data(raw_fragment) or json.loads(raw_fragment)['response']
live_chat_continuation = try_get(
data,
lambda x: x['continuationContents']['liveChatContinuation'], dict) or {}
offset = continuation_id = None
processed_fragment = bytearray()
for action in live_chat_continuation.get('actions', []):
if 'replayChatItemAction' in action:
replay_chat_item_action = action['replayChatItemAction']
offset = int(replay_chat_item_action['videoOffsetTimeMsec'])
processed_fragment.extend(
json.dumps(action, ensure_ascii=False).encode('utf-8') + b'\n')
if offset is not None:
continuation_id = try_get(
live_chat_continuation,
lambda x: x['continuations'][0]['liveChatReplayContinuationData']['continuation'])
self._append_fragment(ctx, processed_fragment)
return True, continuation_id, offset
except compat_urllib_error.HTTPError as err:
count += 1
if count <= fragment_retries:
self.report_retry_fragment(err, frag_index, count, fragment_retries)
if count > fragment_retries:
self.report_error('giving up after %s fragment retries' % fragment_retries)
return False, None, None
self._prepare_and_start_frag_download(ctx)
success, raw_fragment = dl_fragment(
@@ -44,54 +85,25 @@ class YoutubeLiveChatReplayFD(FragmentFD):
if not success:
return False
data = parse_yt_initial_data(raw_fragment)
continuation_id = data['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation']
continuation_id = try_get(
data,
lambda x: x['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation'])
# no data yet but required to call _append_fragment
self._append_fragment(ctx, b'')
first = True
offset = None
frag_index = offset = 0
while continuation_id is not None:
data = None
if first:
url = 'https://www.youtube.com/live_chat_replay?continuation={}'.format(continuation_id)
success, raw_fragment = dl_fragment(url)
if not success:
return False
data = parse_yt_initial_data(raw_fragment)
else:
url = ('https://www.youtube.com/live_chat_replay/get_live_chat_replay'
+ '?continuation={}'.format(continuation_id)
+ '&playerOffsetMs={}'.format(max(offset - 5000, 0))
+ '&hidden=false'
+ '&pbj=1')
success, raw_fragment = dl_fragment(url)
if not success:
return False
data = json.loads(raw_fragment)['response']
first = False
continuation_id = None
live_chat_continuation = data['continuationContents']['liveChatContinuation']
offset = None
processed_fragment = bytearray()
if 'actions' in live_chat_continuation:
for action in live_chat_continuation['actions']:
if 'replayChatItemAction' in action:
replay_chat_item_action = action['replayChatItemAction']
offset = int(replay_chat_item_action['videoOffsetTimeMsec'])
processed_fragment.extend(
json.dumps(action, ensure_ascii=False).encode('utf-8') + b'\n')
try:
continuation_id = live_chat_continuation['continuations'][0]['liveChatReplayContinuationData']['continuation']
except KeyError:
continuation_id = None
self._append_fragment(ctx, processed_fragment)
if test or offset is None:
frag_index += 1
url = ''.join((
'https://www.youtube.com/live_chat_replay',
'/get_live_chat_replay' if frag_index > 1 else '',
'?continuation=%s' % continuation_id,
'&playerOffsetMs=%d&hidden=false&pbj=1' % max(offset - 5000, 0) if frag_index > 1 else ''))
success, continuation_id, offset = download_and_parse_fragment(url, frag_index)
if not success:
return False
if test:
break
self._finish_frag_download(ctx)
return True
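The rewritten loop in this file's diff builds the fragment URL with a single `''.join(...)` instead of the old `first` flag. A minimal sketch of that construction as a standalone function is shown below. The name `live_chat_replay_url` is hypothetical; the URL pieces and the 5000 ms rewind are taken from the diff.

```python
def live_chat_replay_url(continuation_id, frag_index, offset):
    """Build the URL for fragment number frag_index (1-based).

    The first fragment uses the plain replay page; later fragments use the
    get_live_chat_replay endpoint and rewind the player offset by 5 seconds,
    clamped at zero.
    """
    return ''.join((
        'https://www.youtube.com/live_chat_replay',
        '/get_live_chat_replay' if frag_index > 1 else '',
        '?continuation=%s' % continuation_id,
        '&playerOffsetMs=%d&hidden=false&pbj=1' % max(offset - 5000, 0) if frag_index > 1 else ''))


print(live_chat_replay_url('abc', 1, 0))
print(live_chat_replay_url('abc', 2, 12000))
```

Because `%` binds tighter than the conditional expression, the offset is only formatted when `frag_index > 1`, so the first request never touches `offset`.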


@@ -1,13 +1,20 @@
from __future__ import unicode_literals
from ..utils import load_plugins
try:
from .lazy_extractors import *
from .lazy_extractors import _ALL_CLASSES
_LAZY_LOADER = True
_PLUGIN_CLASSES = []
except ImportError:
_LAZY_LOADER = False
if not _LAZY_LOADER:
from .extractors import *
_PLUGIN_CLASSES = load_plugins('extractor', 'IE', globals())
_ALL_CLASSES = [
klass
for name, klass in globals().items()


@@ -1,14 +1,15 @@
# coding: utf-8
from __future__ import unicode_literals
import calendar
import re
import time
from .amp import AMPIE
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..compat import compat_urlparse
from ..utils import (
parse_duration,
parse_iso8601,
try_get,
)
class AbcNewsVideoIE(AMPIE):
@@ -18,8 +19,8 @@ class AbcNewsVideoIE(AMPIE):
(?:
abcnews\.go\.com/
(?:
[^/]+/video/(?P<display_id>[0-9a-z-]+)-|
video/embed\?.*?\bid=
(?:[^/]+/)*video/(?P<display_id>[0-9a-z-]+)-|
video/(?:embed|itemfeed)\?.*?\bid=
)|
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
)
@@ -36,6 +37,8 @@ class AbcNewsVideoIE(AMPIE):
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
'duration': 180,
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1380454200,
'upload_date': '20130929',
},
'params': {
# m3u8 download
@@ -47,6 +50,12 @@ class AbcNewsVideoIE(AMPIE):
}, {
'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
'only_matching': True,
}, {
'url': 'http://abcnews.go.com/video/itemfeed?id=46979033',
'only_matching': True,
}, {
'url': 'https://abcnews.go.com/GMA/News/video/history-christmas-story-67894761',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -67,28 +76,23 @@ class AbcNewsIE(InfoExtractor):
_VALID_URL = r'https?://abcnews\.go\.com/(?:[^/]+/)+(?P<display_id>[0-9a-z-]+)/story\?id=(?P<id>\d+)'
_TESTS = [{
'url': 'http://abcnews.go.com/Blotter/News/dramatic-video-rare-death-job-america/story?id=10498713#.UIhwosWHLjY',
# Youtube Embeds
'url': 'https://abcnews.go.com/Entertainment/peter-billingsley-child-actor-christmas-story-hollywood-power/story?id=51286501',
'info_dict': {
'id': '10505354',
'ext': 'flv',
'display_id': 'dramatic-video-rare-death-job-america',
'title': 'Occupational Hazards',
'description': 'Nightline investigates the dangers that lurk at various jobs.',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20100428',
'timestamp': 1272412800,
'id': '51286501',
'title': "Peter Billingsley: From child actor in 'A Christmas Story' to Hollywood power player",
'description': 'Billingsley went from a child actor to Hollywood power player.',
},
'add_ie': ['AbcNewsVideo'],
'playlist_count': 5,
}, {
'url': 'http://abcnews.go.com/Entertainment/justin-timberlake-performs-stop-feeling-eurovision-2016/story?id=39125818',
'info_dict': {
'id': '38897857',
'ext': 'mp4',
'display_id': 'justin-timberlake-performs-stop-feeling-eurovision-2016',
'title': 'Justin Timberlake Drops Hints For Secret Single',
'description': 'Lara Spencer reports the buzziest stories of the day in "GMA" Pop News.',
'upload_date': '20160515',
'timestamp': 1463329500,
'upload_date': '20160505',
'timestamp': 1462442280,
},
'params': {
# m3u8 download
@@ -100,49 +104,55 @@ class AbcNewsIE(InfoExtractor):
}, {
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
'only_matching': True,
}, {
# inline.type == 'video'
'url': 'http://abcnews.go.com/Technology/exclusive-apple-ceo-tim-cook-iphone-cracking-software/story?id=37173343',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('display_id')
video_id = mobj.group('id')
story_id = self._match_id(url)
webpage = self._download_webpage(url, story_id)
story = self._parse_json(self._search_regex(
r"window\['__abcnews__'\]\s*=\s*({.+?});",
webpage, 'data'), story_id)['page']['content']['story']['everscroll'][0]
article_contents = story.get('articleContents') or {}
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'window\.abcnvideo\.url\s*=\s*"([^"]+)"', webpage, 'video URL')
full_video_url = compat_urlparse.urljoin(url, video_url)
def entries():
featured_video = story.get('featuredVideo') or {}
feed = try_get(featured_video, lambda x: x['video']['feed'])
if feed:
yield {
'_type': 'url',
'id': featured_video.get('id'),
'title': featured_video.get('name'),
'url': feed,
'thumbnail': featured_video.get('images'),
'description': featured_video.get('description'),
'timestamp': parse_iso8601(featured_video.get('uploadDate')),
'duration': parse_duration(featured_video.get('duration')),
'ie_key': AbcNewsVideoIE.ie_key(),
}
youtube_url = YoutubeIE._extract_url(webpage)
for inline in (article_contents.get('inlines') or []):
inline_type = inline.get('type')
if inline_type == 'iframe':
iframe_url = try_get(inline, lambda x: x['attrs']['src'])
if iframe_url:
yield self.url_result(iframe_url)
elif inline_type == 'video':
video_id = inline.get('id')
if video_id:
yield {
'_type': 'url',
'id': video_id,
'url': 'http://abcnews.go.com/video/embed?id=' + video_id,
'thumbnail': inline.get('imgSrc') or inline.get('imgDefault'),
'description': inline.get('description'),
'duration': parse_duration(inline.get('duration')),
'ie_key': AbcNewsVideoIE.ie_key(),
}
timestamp = None
date_str = self._html_search_regex(
r'<span[^>]+class="timestamp">([^<]+)</span>',
webpage, 'timestamp', fatal=False)
if date_str:
tz_offset = 0
if date_str.endswith(' ET'): # Eastern Time
tz_offset = -5
date_str = date_str[:-3]
date_formats = ['%b. %d, %Y', '%b %d, %Y, %I:%M %p']
for date_format in date_formats:
try:
timestamp = calendar.timegm(time.strptime(date_str.strip(), date_format))
except ValueError:
continue
if timestamp is not None:
timestamp -= tz_offset * 3600
entry = {
'_type': 'url_transparent',
'ie_key': AbcNewsVideoIE.ie_key(),
'url': full_video_url,
'id': video_id,
'display_id': display_id,
'timestamp': timestamp,
}
if youtube_url:
entries = [entry, self.url_result(youtube_url, ie=YoutubeIE.ie_key())]
return self.playlist_result(entries)
return entry
return self.playlist_result(
entries(), story_id, article_contents.get('headline'),
article_contents.get('subHead'))
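Review note: the rewritten `AbcNewsIE` reads the story from a JSON blob embedded in the page and then walks its `inlines`. The sketch below reproduces that flow outside the extractor framework. `SAMPLE_PAGE` is a made-up fragment shaped like the keys the diff accesses, and `extract_story` / `iter_inline_urls` are hypothetical helper names.

```python
import json
import re

# Hypothetical page fragment; only the keys used in the diff are present.
SAMPLE_PAGE = (
    '<script>window[\'__abcnews__\'] = {"page": {"content": {"story": '
    '{"everscroll": [{"articleContents": {"headline": "Example headline", '
    '"subHead": "Example sub-head", "inlines": ['
    '{"type": "video", "id": "123"}, '
    '{"type": "iframe", "attrs": {"src": "https://example.com/embed"}}, '
    '{"type": "image"}]}}]}}}};</script>'
)


def extract_story(webpage):
    # Same pattern as the diff: capture the object assigned to window['__abcnews__'].
    raw = re.search(r"window\['__abcnews__'\]\s*=\s*({.+?});", webpage).group(1)
    return json.loads(raw)['page']['content']['story']['everscroll'][0]


def iter_inline_urls(story):
    article = story.get('articleContents') or {}
    for inline in article.get('inlines') or []:
        inline_type = inline.get('type')
        if inline_type == 'iframe':
            # Embedded players carry their URL in attrs.src.
            src = (inline.get('attrs') or {}).get('src')
            if src:
                yield src
        elif inline_type == 'video' and inline.get('id'):
            # Inline videos are routed back through the embed URL.
            yield 'http://abcnews.go.com/video/embed?id=' + inline['id']
```

Note that the non-greedy `{.+?});` stops at the first `};` it meets, so this approach assumes no `};` sequence occurs inside the JSON itself.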


@@ -26,6 +26,7 @@ from ..utils import (
strip_or_none,
try_get,
unified_strdate,
urlencode_postdata,
)
@@ -51,9 +52,12 @@ class ADNIE(InfoExtractor):
}
}
_NETRC_MACHINE = 'animedigitalnetwork'
_BASE_URL = 'http://animedigitalnetwork.fr'
_API_BASE_URL = 'https://gw.api.animedigitalnetwork.fr/'
_PLAYER_BASE_URL = _API_BASE_URL + 'player/'
_HEADERS = {}
_LOGIN_ERR_MESSAGE = 'Unable to log in'
_RSA_KEY = (0x9B42B08905199A5CCE2026274399CA560ECB209EE9878A708B1C0812E1BB8CB5D1FB7441861147C1A1F2F3A0476DD63A9CAC20D3E983613346850AA6CB38F16DC7D720FD7D86FC6E5B3D5BBC72E14CD0BF9E869F2CEA2CCAD648F1DCE38F1FF916CEFB2D339B64AA0264372344BC775E265E8A852F88144AB0BD9AA06C1A4ABB, 65537)
_POS_ALIGN_MAP = {
'start': 1,
@@ -129,19 +133,42 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
}])
return subtitles
def _real_initialize(self):
username, password = self._get_login_info()
if not username:
return
try:
access_token = (self._download_json(
self._API_BASE_URL + 'authentication/login', None,
'Logging in', self._LOGIN_ERR_MESSAGE, fatal=False,
data=urlencode_postdata({
'password': password,
'rememberMe': False,
'source': 'Web',
'username': username,
})) or {}).get('accessToken')
if access_token:
self._HEADERS = {'authorization': 'Bearer ' + access_token}
except ExtractorError as e:
message = None
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
resp = self._parse_json(
e.cause.read().decode(), None, fatal=False) or {}
message = resp.get('message') or resp.get('code')
self.report_warning(message or self._LOGIN_ERR_MESSAGE)
def _real_extract(self, url):
video_id = self._match_id(url)
video_base_url = self._PLAYER_BASE_URL + 'video/%s/' % video_id
player = self._download_json(
video_base_url + 'configuration', video_id,
'Downloading player config JSON metadata')['player']
'Downloading player config JSON metadata',
headers=self._HEADERS)['player']
options = player['options']
user = options['user']
if not user.get('hasAccess'):
raise ExtractorError(
'This video is only available for paying users', expected=True)
# self.raise_login_required() # FIXME: Login is not implemented
self.raise_login_required()
token = self._download_json(
user.get('refreshTokenUrl') or (self._PLAYER_BASE_URL + 'refresh/token'),
@@ -188,8 +215,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
message = error.get('message')
if e.cause.code == 403 and error.get('code') == 'player-bad-geolocation-country':
self.raise_geo_restricted(msg=message)
else:
raise ExtractorError(message)
raise ExtractorError(message)
else:
raise ExtractorError('Giving up retrying')
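Review note: the new `_real_initialize` turns a login response into a bearer header that `_real_extract` then passes to the player configuration request. A minimal sketch of just that conversion, assuming the response is either a dict or a falsy value; `build_auth_headers` is a hypothetical name.

```python
def build_auth_headers(login_response):
    """Return request headers for a login response, or {} when no token is present."""
    # A failed, non-fatal request may yield a falsy value, hence the `or {}`.
    access_token = (login_response or {}).get('accessToken')
    if not access_token:
        return {}
    return {'authorization': 'Bearer ' + access_token}
```

Returning an empty dict keeps anonymous requests working unchanged, which matches the `_HEADERS = {}` default in the diff.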


@@ -252,11 +252,11 @@ class AENetworksShowIE(AENetworksListBaseIE):
_TESTS = [{
'url': 'http://www.history.com/shows/ancient-aliens',
'info_dict': {
'id': 'SH012427480000',
'id': 'SERIES1574',
'title': 'Ancient Aliens',
'description': 'md5:3f6d74daf2672ff3ae29ed732e37ea7f',
},
'playlist_mincount': 168,
'playlist_mincount': 150,
}]
_RESOURCE = 'series'
_ITEMS_KEY = 'episodes'


@@ -1,13 +1,16 @@
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
class AlJazeeraIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?:programmes|video)/.*?/(?P<id>[^/]+)\.html'
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/(?P<type>program/[^/]+|(?:feature|video)s)/\d{4}/\d{1,2}/\d{1,2}/(?P<id>[^/?&#]+)'
_TESTS = [{
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
'url': 'https://www.aljazeera.com/program/episode/2014/9/19/deliverance',
'info_dict': {
'id': '3792260579001',
'ext': 'mp4',
@@ -20,14 +23,34 @@ class AlJazeeraIE(InfoExtractor):
'add_ie': ['BrightcoveNew'],
'skip': 'Not accessible from Travis CI server',
}, {
'url': 'http://www.aljazeera.com/video/news/2017/05/sierra-leone-709-carat-diamond-auctioned-170511100111930.html',
'url': 'https://www.aljazeera.com/videos/2017/5/11/sierra-leone-709-carat-diamond-to-be-auctioned-off',
'only_matching': True,
}, {
'url': 'https://www.aljazeera.com/features/2017/8/21/transforming-pakistans-buses-into-art',
'only_matching': True,
}]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/665003303001/default_default/index.html?videoId=%s'
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'
def _real_extract(self, url):
program_name = self._match_id(url)
webpage = self._download_webpage(url, program_name)
brightcove_id = self._search_regex(
r'RenderPagesVideo\(\'(.+?)\'', webpage, 'brightcove id')
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
post_type, name = re.match(self._VALID_URL, url).groups()
post_type = {
'features': 'post',
'program': 'episode',
'videos': 'video',
}[post_type.split('/')[0]]
video = self._download_json(
'https://www.aljazeera.com/graphql', name, query={
'operationName': 'SingleArticleQuery',
'variables': json.dumps({
'name': name,
'postType': post_type,
}),
}, headers={
'wp-site': 'aje',
})['data']['article']['video']
video_id = video['id']
account_id = video.get('accountId') or '665003303001'
player_id = video.get('playerId') or 'BkeSH5BDb'
return self.url_result(
self.BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, video_id),
'BrightcoveNew', video_id)
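Review note: the new `AlJazeeraIE` derives the GraphQL variables directly from the URL. The sketch below isolates that step so the mapping from path prefix to `postType` can be checked on its own; `build_graphql_query` is a hypothetical helper and no request is made.

```python
import json
import re

VALID_URL = (r'https?://(?:www\.)?aljazeera\.com/'
             r'(?P<type>program/[^/]+|(?:feature|video)s)/'
             r'\d{4}/\d{1,2}/\d{1,2}/(?P<id>[^/?&#]+)')

# Path prefix -> postType, as in the diff.
POST_TYPES = {
    'features': 'post',
    'program': 'episode',
    'videos': 'video',
}


def build_graphql_query(url):
    post_type, name = re.match(VALID_URL, url).groups()
    return {
        'operationName': 'SingleArticleQuery',
        # 'program/<show>' collapses to 'program' before the lookup.
        'variables': json.dumps({
            'name': name,
            'postType': POST_TYPES[post_type.split('/')[0]],
        }),
    }
```

For example, the test URL `https://www.aljazeera.com/program/episode/2014/9/19/deliverance` yields the name `deliverance` with post type `episode`.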


@@ -1,13 +1,16 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
int_or_none,
try_get,
unified_strdate,
unified_timestamp,
)
@@ -22,8 +25,8 @@ class AmericasTestKitchenIE(InfoExtractor):
'ext': 'mp4',
'description': 'md5:64e606bfee910627efc4b5f050de92b3',
'thumbnail': r're:^https?://',
'timestamp': 1523664000,
'upload_date': '20180414',
'timestamp': 1523318400,
'upload_date': '20180410',
'release_date': '20180410',
'series': "America's Test Kitchen",
'season_number': 18,
@@ -33,6 +36,27 @@ class AmericasTestKitchenIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
# Metadata parsing behaves differently for newer episodes (705) as opposed to older episodes (582 above)
'url': 'https://www.americastestkitchen.com/episode/705-simple-chicken-dinner',
'md5': '06451608c57651e985a498e69cec17e5',
'info_dict': {
'id': '5fbe8c61bda2010001c6763b',
'title': 'Simple Chicken Dinner',
'ext': 'mp4',
'description': 'md5:eb68737cc2fd4c26ca7db30139d109e7',
'thumbnail': r're:^https?://',
'timestamp': 1610755200,
'upload_date': '20210116',
'release_date': '20210116',
'series': "America's Test Kitchen",
'season_number': 21,
'episode': 'Simple Chicken Dinner',
'episode_number': 3,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.americastestkitchen.com/videos/3420-pan-seared-salmon',
'only_matching': True,
@@ -60,7 +84,76 @@ class AmericasTestKitchenIE(InfoExtractor):
'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % video['zypeId'],
'ie_key': 'Zype',
'description': clean_html(video.get('description')),
'timestamp': unified_timestamp(video.get('publishDate')),
'release_date': unified_strdate(video.get('publishDate')),
'episode_number': int_or_none(episode.get('number')),
'season_number': int_or_none(episode.get('season')),
'series': try_get(episode, lambda x: x['show']['title']),
'episode': episode.get('title'),
}
class AmericasTestKitchenSeasonIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?P<show>americastestkitchen|cookscountry)\.com/episodes/browse/season_(?P<id>\d+)'
_TESTS = [{
# ATK Season
'url': 'https://www.americastestkitchen.com/episodes/browse/season_1',
'info_dict': {
'id': 'season_1',
'title': 'Season 1',
},
'playlist_count': 13,
}, {
# Cooks Country Season
'url': 'https://www.cookscountry.com/episodes/browse/season_12',
'info_dict': {
'id': 'season_12',
'title': 'Season 12',
},
'playlist_count': 13,
}]
def _real_extract(self, url):
show_name, season_number = re.match(self._VALID_URL, url).groups()
season_number = int(season_number)
slug = 'atk' if show_name == 'americastestkitchen' else 'cco'
season = 'Season %d' % season_number
season_search = self._download_json(
'https://y1fnzxui30-dsn.algolia.net/1/indexes/everest_search_%s_season_desc_production' % slug,
season, headers={
'Origin': 'https://www.%s.com' % show_name,
'X-Algolia-API-Key': '8d504d0099ed27c1b73708d22871d805',
'X-Algolia-Application-Id': 'Y1FNZXUI30',
}, query={
'facetFilters': json.dumps([
'search_season_list:' + season,
'search_document_klass:episode',
'search_show_slug:' + slug,
]),
'attributesToRetrieve': 'description,search_%s_episode_number,search_document_date,search_url,title' % slug,
'attributesToHighlight': '',
'hitsPerPage': 1000,
})
def entries():
for episode in (season_search.get('hits') or []):
search_url = episode.get('search_url')
if not search_url:
continue
yield {
'_type': 'url',
'url': 'https://www.%s.com%s' % (show_name, search_url),
'id': try_get(episode, lambda e: e['objectID'].split('_')[-1]),
'title': episode.get('title'),
'description': episode.get('description'),
'timestamp': unified_timestamp(episode.get('search_document_date')),
'season_number': season_number,
'episode_number': int_or_none(episode.get('search_%s_episode_number' % slug)),
'ie_key': AmericasTestKitchenIE.ie_key(),
}
return self.playlist_result(
entries(), 'season_%d' % season_number, season)
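Review note: the season extractor's search parameters depend only on the show name and season number. A standalone sketch of how they are built, with no network access; `build_season_query` is a hypothetical name.

```python
import json


def build_season_query(show_name, season_number):
    """Build the search query parameters for one season (illustrative only)."""
    # The diff uses 'atk' for americastestkitchen and 'cco' otherwise.
    slug = 'atk' if show_name == 'americastestkitchen' else 'cco'
    season = 'Season %d' % season_number
    return {
        # facetFilters is sent as a JSON-encoded list of "field:value" strings.
        'facetFilters': json.dumps([
            'search_season_list:' + season,
            'search_document_klass:episode',
            'search_show_slug:' + slug,
        ]),
        'attributesToRetrieve': 'description,search_%s_episode_number,search_document_date,search_url,title' % slug,
        'attributesToHighlight': '',
        'hitsPerPage': 1000,
    }
```

The slug appears in three places (index name, facet filter, and episode-number attribute), so deriving it once avoids the three drifting apart.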


@@ -8,6 +8,7 @@ from ..utils import (
int_or_none,
mimetype2ext,
parse_iso8601,
unified_timestamp,
url_or_none,
)
@@ -88,7 +89,7 @@ class AMPIE(InfoExtractor):
self._sort_formats(formats)
timestamp = parse_iso8601(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
timestamp = unified_timestamp(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
return {
'id': video_id,


@@ -21,6 +21,16 @@ from ..utils import (
unsmuggle_url,
)
# This import causes a ModuleNotFoundError on some systems for an unknown reason.
# See issues:
# https://github.com/pukkandan/yt-dlp/issues/35
# https://github.com/ytdl-org/youtube-dl/issues/27449
# https://github.com/animelover1984/youtube-dl/issues/17
try:
from .anvato_token_generator import NFLTokenGenerator
except ImportError:
NFLTokenGenerator = None
def md5_text(s):
if not isinstance(s, compat_str):
@@ -203,6 +213,10 @@ class AnvatoIE(InfoExtractor):
'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582'
}
_TOKEN_GENERATORS = {
'GXvEgwyJeWem8KCYXfeoHWknwP48Mboj': NFLTokenGenerator,
}
_API_KEY = '3hwbSuqqT690uxjNYBktSQpa5ZrpYYR0Iofx7NcJHyA'
_ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'
@@ -262,9 +276,12 @@ class AnvatoIE(InfoExtractor):
'anvrid': anvrid,
'anvts': server_time,
}
api['anvstk'] = md5_text('%s|%s|%d|%s' % (
access_key, anvrid, server_time,
self._ANVACK_TABLE.get(access_key, self._API_KEY)))
if self._TOKEN_GENERATORS.get(access_key) is not None:
api['anvstk2'] = self._TOKEN_GENERATORS[access_key].generate(self, access_key, video_id)
else:
api['anvstk'] = md5_text('%s|%s|%d|%s' % (
access_key, anvrid, server_time,
self._ANVACK_TABLE.get(access_key, self._API_KEY)))
return self._download_json(
video_data_url, video_id, transform_source=strip_jsonp,
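Review note: the Anvato change combines two patterns, an optional import that degrades to `None` and a per-key dispatch that falls back to the MD5 token when no generator is available. The sketch below shows both under stated assumptions: `_hypothetical_token_module`, the `'special-key'` entry, the simplified `generate(access_key)` call, and this version of `md5_text` are all illustrative, since the real helper is only partly visible in the diff.

```python
import hashlib

try:
    # Hypothetical optional dependency standing in for the token generator module.
    from _hypothetical_token_module import TokenGenerator
except ImportError:
    TokenGenerator = None

# Access key -> generator class (None when the import above failed).
TOKEN_GENERATORS = {
    'special-key': TokenGenerator,
}


def md5_text(s):
    # Assumed behaviour: hex MD5 digest of the UTF-8 encoded text.
    return hashlib.md5(s.encode('utf-8')).hexdigest()


def sign_request(api, access_key, anvrid, server_time, secret):
    generator = TOKEN_GENERATORS.get(access_key)
    if generator is not None:
        # A dedicated generator is available for this key.
        api['anvstk2'] = generator.generate(access_key)
    else:
        # Unknown key, or the optional import failed: use the MD5 token.
        api['anvstk'] = md5_text('%s|%s|%d|%s' % (
            access_key, anvrid, server_time, secret))
    return api
```

Checking the looked-up value against `None`, rather than testing key membership, is what lets a failed import fall through to the MD5 path.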


@@ -3,7 +3,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .yahoo import YahooIE
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse,
@@ -15,9 +15,9 @@ from ..utils import (
)
class AolIE(InfoExtractor):
class AolIE(YahooIE):
IE_NAME = 'aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>[0-9a-f]+)'
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})'
_TESTS = [{
# video with 5min ID
@@ -76,10 +76,16 @@ class AolIE(InfoExtractor):
}, {
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
'only_matching': True,
}, {
# Yahoo video
'url': 'https://www.aol.com/video/play/991e6700-ac02-11ea-99ff-357400036f61/24bbc846-3e30-3c46-915e-fe8ccd7fcc46/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
if '-' in video_id:
return self._extract_yahoo_video(video_id, 'us')
response = self._download_json(
'https://feedapi.b2c.on.aol.com/v1.0/app/videos/aolon/%s/details' % video_id,
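Review note: the tightened `_VALID_URL` accepts three ID shapes (nine digits, 24 hex characters, or a UUID), and `_real_extract` hands UUIDs to the Yahoo code path. The sketch below checks that routing in isolation; `classify` is a hypothetical helper.

```python
import re

VALID_URL = (r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)'
             r'(?P<id>\d{9}|[0-9a-f]{24}|[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12})')


def classify(url):
    """Return (backend, video_id) for a matching URL."""
    video_id = re.match(VALID_URL, url).group('id')
    # Only the UUID form contains hyphens, so this is enough to route it.
    return ('yahoo' if '-' in video_id else 'aol', video_id)
```

Because `(?:[^/]+/)*` is greedy and backtracks one path segment at a time, the captured ID is the last segment that fits one of the three shapes.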


@@ -226,13 +226,13 @@ class ARDMediathekIE(ARDMediathekBaseIE):
if doc.tag == 'rss':
return GenericIE()._extract_rss(url, video_id, doc)
title = self._html_search_regex(
title = self._og_search_title(webpage, default=None) or self._html_search_regex(
[r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
r'<meta name="dcterms\.title" content="(.*?)"/>',
r'<h4 class="headline">(.*?)</h4>',
r'<title[^>]*>(.*?)</title>'],
webpage, 'title')
description = self._html_search_meta(
description = self._og_search_description(webpage, default=None) or self._html_search_meta(
'dcterms.abstract', webpage, 'description', default=None)
if description is None:
description = self._html_search_meta(
@@ -289,18 +289,18 @@ class ARDMediathekIE(ARDMediathekBaseIE):
class ARDIE(InfoExtractor):
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
_VALID_URL = r'(?P<mainurl>https?://(?:www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?:video-?)?(?P<id>[0-9]+))\.html'
_TESTS = [{
# available till 14.02.2019
'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html',
'md5': '8e4ec85f31be7c7fc08a26cdbc5a1f49',
# available till 7.01.2022
'url': 'https://www.daserste.de/information/talk/maischberger/videos/maischberger-die-woche-video100.html',
'md5': '867d8aa39eeaf6d76407c5ad1bb0d4c1',
'info_dict': {
'display_id': 'das-groko-drama-zerlegen-sich-die-volksparteien-video',
'id': '102',
'display_id': 'maischberger-die-woche',
'id': '100',
'ext': 'mp4',
'duration': 4435.0,
'title': 'Das GroKo-Drama: Zerlegen sich die Volksparteien?',
'upload_date': '20180214',
'duration': 3687.0,
'title': 'maischberger. die woche vom 7. Januar 2021',
'upload_date': '20210107',
'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
@@ -355,17 +355,17 @@ class ARDIE(InfoExtractor):
class ARDBetaMediathekIE(ARDMediathekBaseIE):
_VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?P<mode>player|live|video|sendung|sammlung)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)'
_TESTS = [{
'url': 'https://ardmediathek.de/ard/video/die-robuste-roswita/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
'md5': 'dfdc87d2e7e09d073d5a80770a9ce88f',
'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/',
'md5': 'a1dc75a39c61601b980648f7c9f9f71d',
'info_dict': {
'display_id': 'die-robuste-roswita',
'id': '70153354',
'id': '78566716',
'title': 'Die robuste Roswita',
'description': r're:^Der Mord.*trüber ist als die Ilm.',
'description': r're:^Der Mord.*totgeglaubte Ehefrau Roswita',
'duration': 5316,
'thumbnail': 'https://img.ardmediathek.de/standard/00/70/15/33/90/-1852531467/16x9/960?mandant=ard',
'timestamp': 1577047500,
'upload_date': '20191222',
'thumbnail': 'https://img.ardmediathek.de/standard/00/78/56/67/84/575672121/16x9/960?mandant=ard',
'timestamp': 1596658200,
'upload_date': '20200805',
'ext': 'mp4',
},
}, {
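Review note: the updated `ARDIE._VALID_URL` adds an optional `video` / `video-` prefix before the numeric ID. The sketch below runs the new pattern against the old and new test URLs from this diff to show how the groups split; `split_ids` is a hypothetical helper.

```python
import re

VALID_URL = (r'(?P<mainurl>https?://(?:www\.)?daserste\.de/[^?#]+/videos(?:extern)?/'
             r'(?P<display_id>[^/?#]+)-(?:video-?)?(?P<id>[0-9]+))\.html')


def split_ids(url):
    mobj = re.match(VALID_URL, url)
    return mobj.group('display_id'), mobj.group('id')
```

With the new-style URL ending in `-video100.html` the prefix is consumed and the display ID is `maischberger-die-woche`. With the old-style `-video-102.html` the greedy `display_id` group still keeps the trailing `-video`.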


@@ -0,0 +1,247 @@
# coding: utf-8
from __future__ import unicode_literals
import random
import re
from .common import InfoExtractor
from ..utils import ExtractorError, try_get, compat_str, str_or_none
from ..compat import compat_urllib_parse_unquote
class AudiusBaseIE(InfoExtractor):
_API_BASE = None
_API_V = '/v1'
def _get_response_data(self, response):
if isinstance(response, dict):
response_data = response.get('data')
if response_data is not None:
return response_data
if len(response) == 1 and 'message' in response:
raise ExtractorError('API error: %s' % response['message'],
expected=True)
raise ExtractorError('Unexpected API response')
def _select_api_base(self):
"""Select one of the currently available API hosts."""
response = super(AudiusBaseIE, self)._download_json(
'https://api.audius.co/', None,
note='Requesting available API hosts',
errnote='Unable to request available API hosts')
hosts = self._get_response_data(response)
if isinstance(hosts, list):
self._API_BASE = random.choice(hosts)
return
raise ExtractorError('Unable to get available API hosts')
@staticmethod
def _prepare_url(url, title):
"""
Audius removes forward slashes from the URI but leaves backslashes.
Current versions of Chrome replace backslashes in the address bar with
forward slashes, so a link copied from there and pasted into youtube-dl
cannot be downloaded: the Audius API is unable to resolve such a URL.
"""
url = compat_urllib_parse_unquote(url)
title = compat_urllib_parse_unquote(title)
if '/' in title or '%2F' in title:
fixed_title = title.replace('/', '%5C').replace('%2F', '%5C')
return url.replace(title, fixed_title)
return url
def _api_request(self, path, item_id=None, note='Downloading JSON metadata',
errnote='Unable to download JSON metadata',
expected_status=None):
if self._API_BASE is None:
self._select_api_base()
try:
response = super(AudiusBaseIE, self)._download_json(
'%s%s%s' % (self._API_BASE, self._API_V, path), item_id, note=note,
errnote=errnote, expected_status=expected_status)
except ExtractorError as exc:
# some of Audius API hosts may not work as expected and return HTML
if 'Failed to parse JSON' in compat_str(exc):
raise ExtractorError('An error occurred while receiving data. Try again',
expected=True)
raise exc
return self._get_response_data(response)
def _resolve_url(self, url, item_id):
return self._api_request('/resolve?url=%s' % url, item_id,
expected_status=404)
class AudiusIE(AudiusBaseIE):
_VALID_URL = r'''(?x)https?://(?:www\.)?(?:audius\.co/(?P<uploader>[\w\d-]+)(?!/album|/playlist)/(?P<title>\S+))'''
IE_DESC = 'Audius.co'
_TESTS = [
{
# URL copied from the Chrome address bar, which replaces backslashes with forward slashes
'url': 'https://audius.co/test_acc/t%D0%B5%D0%B5%D0%B5est-1.%5E_%7B%7D/%22%3C%3E.%E2%84%96~%60-198631',
'md5': '92c35d3e754d5a0f17eef396b0d33582',
'info_dict': {
'id': 'xd8gY',
'title': '''Tеееest/ 1.!@#$%^&*()_+=[]{};'\\\":<>,.?/№~`''',
'ext': 'mp3',
'description': 'Description',
'duration': 30,
'track': '''Tеееest/ 1.!@#$%^&*()_+=[]{};'\\\":<>,.?/№~`''',
'artist': 'test',
'genre': 'Electronic',
'thumbnail': r're:https?://.*\.jpg',
'view_count': int,
'like_count': int,
'repost_count': int,
}
},
{
# Regular track
'url': 'https://audius.co/voltra/radar-103692',
'md5': '491898a0a8de39f20c5d6a8a80ab5132',
'info_dict': {
'id': 'KKdy2',
'title': 'RADAR',
'ext': 'mp3',
'duration': 318,
'track': 'RADAR',
'artist': 'voltra',
'genre': 'Trance',
'thumbnail': r're:https?://.*\.jpg',
'view_count': int,
'like_count': int,
'repost_count': int,
}
},
]
_ARTWORK_MAP = {
"150x150": 150,
"480x480": 480,
"1000x1000": 1000
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
track_id = try_get(mobj, lambda x: x.group('track_id'))
if track_id is None:
title = mobj.group('title')
# uploader = mobj.group('uploader')
url = self._prepare_url(url, title)
track_data = self._resolve_url(url, title)
else: # API link
title = None
# uploader = None
track_data = self._api_request('/tracks/%s' % track_id, track_id)
if not isinstance(track_data, dict):
raise ExtractorError('Unexpected API response')
track_id = track_data.get('id')
if track_id is None:
raise ExtractorError('Unable to get ID of the track')
artworks_data = track_data.get('artwork')
thumbnails = []
if isinstance(artworks_data, dict):
for quality_key, thumbnail_url in artworks_data.items():
thumbnail = {
"url": thumbnail_url
}
quality_code = self._ARTWORK_MAP.get(quality_key)
if quality_code is not None:
thumbnail['preference'] = quality_code
thumbnails.append(thumbnail)
return {
'id': track_id,
'title': track_data.get('title', title),
'url': '%s/v1/tracks/%s/stream' % (self._API_BASE, track_id),
'ext': 'mp3',
'description': track_data.get('description'),
'duration': track_data.get('duration'),
'track': track_data.get('title'),
'artist': try_get(track_data, lambda x: x['user']['name'], compat_str),
'genre': track_data.get('genre'),
'thumbnails': thumbnails,
'view_count': track_data.get('play_count'),
'like_count': track_data.get('favorite_count'),
'repost_count': track_data.get('repost_count'),
}
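Review note: `AudiusIE` converts the API's `artwork` mapping into a thumbnail list, attaching a `preference` only for the sizes it recognises. A standalone sketch of that conversion follows; `build_thumbnails` is a hypothetical name and the sample keys in the usage below are made up.

```python
ARTWORK_MAP = {
    '150x150': 150,
    '480x480': 480,
    '1000x1000': 1000,
}


def build_thumbnails(artwork):
    """Turn an artwork dict of size key -> URL into a list of thumbnail dicts."""
    thumbnails = []
    if isinstance(artwork, dict):
        for quality_key, thumbnail_url in artwork.items():
            thumbnail = {'url': thumbnail_url}
            # Unknown size keys are kept, just without a preference.
            preference = ARTWORK_MAP.get(quality_key)
            if preference is not None:
                thumbnail['preference'] = preference
            thumbnails.append(thumbnail)
    return thumbnails
```

Anything that is not a dict, including a missing `artwork` field, results in an empty list rather than an error.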
class AudiusTrackIE(AudiusIE):
_VALID_URL = r'''(?x)(?:audius:)(?:https?://(?:www\.)?.+/v1/tracks/)?(?P<track_id>\w+)'''
IE_NAME = 'audius:track'
IE_DESC = 'Audius track ID or API link. Prepend with "audius:"'
_TESTS = [
{
'url': 'audius:9RWlo',
'only_matching': True
},
{
'url': 'audius:http://discoveryprovider.audius.prod-us-west-2.staked.cloud/v1/tracks/9RWlo',
'only_matching': True
},
]
class AudiusPlaylistIE(AudiusBaseIE):
_VALID_URL = r'https?://(?:www\.)?audius\.co/(?P<uploader>[\w\d-]+)/(?:album|playlist)/(?P<title>\S+)'
IE_NAME = 'audius:playlist'
IE_DESC = 'Audius.co playlists'
_TEST = {
'url': 'https://audius.co/test_acc/playlist/test-playlist-22910',
'info_dict': {
'id': 'DNvjN',
'title': 'test playlist',
'description': 'Test description\n\nlol',
},
'playlist_count': 175,
}
def _build_playlist(self, tracks):
entries = []
for track in tracks:
if not isinstance(track, dict):
raise ExtractorError('Unexpected API response')
track_id = str_or_none(track.get('id'))
if not track_id:
raise ExtractorError('Unable to get track ID from playlist')
entries.append(self.url_result(
'audius:%s' % track_id,
ie=AudiusTrackIE.ie_key(), video_id=track_id))
return entries
def _real_extract(self, url):
self._select_api_base()
mobj = re.match(self._VALID_URL, url)
title = mobj.group('title')
# uploader = mobj.group('uploader')
url = self._prepare_url(url, title)
playlist_response = self._resolve_url(url, title)
if not isinstance(playlist_response, list) or len(playlist_response) != 1:
raise ExtractorError('Unexpected API response')
playlist_data = playlist_response[0]
if not isinstance(playlist_data, dict):
raise ExtractorError('Unexpected API response')
playlist_id = playlist_data.get('id')
if playlist_id is None:
raise ExtractorError('Unable to get playlist ID')
playlist_tracks = self._api_request(
'/playlists/%s/tracks' % playlist_id,
title, note='Downloading playlist tracks metadata',
errnote='Unable to download playlist tracks metadata')
if not isinstance(playlist_tracks, list):
raise ExtractorError('Unexpected API response')
entries = self._build_playlist(playlist_tracks)
return self.playlist_result(entries, playlist_id,
playlist_data.get('playlist_name', title),
playlist_data.get('description'))
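Review note: `_prepare_url` in `AudiusBaseIE` rewrites forward slashes inside the track title to `%5C` so that the resolve endpoint sees an encoded backslash. The sketch below restates that logic with the Python 3 standard library in place of the compat helper; the sample URLs in the usage note are made up.

```python
from urllib.parse import unquote


def prepare_url(url, title):
    """Re-encode forward slashes in the title part of an Audius URL as %5C."""
    url = unquote(url)
    title = unquote(title)
    if '/' in title or '%2F' in title:
        fixed_title = title.replace('/', '%5C').replace('%2F', '%5C')
        # Only the title portion of the URL is rewritten.
        return url.replace(title, fixed_title)
    return url
```

For instance, a URL ending in `user/a%2Fb-123` with title `a%2Fb-123` comes back ending in `user/a%5Cb-123`, while a title without slashes leaves the unquoted URL untouched.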


@@ -48,6 +48,7 @@ class AWAANBaseIE(InfoExtractor):
'duration': int_or_none(video_data.get('duration')),
'timestamp': parse_iso8601(video_data.get('create_time'), ' '),
'is_live': is_live,
'uploader_id': video_data.get('user_id'),
}
@@ -107,6 +108,7 @@ class AWAANLiveIE(AWAANBaseIE):
'title': 're:Dubai Al Oula [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'upload_date': '20150107',
'timestamp': 1420588800,
'uploader_id': '71',
},
'params': {
# m3u8 download


@@ -47,7 +47,7 @@ class AZMedienIE(InfoExtractor):
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
'only_matching': True
}]
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/cb9f2f81ed22e9b47f4ca64ea3cc5a5d13e88d1d'
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/a4016f65fe62b81dc6664dd9f4910e4ab40383be'
_PARTNER_ID = '1719221'
def _real_extract(self, url):


@@ -2,9 +2,10 @@
from __future__ import unicode_literals
import hashlib
import json
import re
from .common import InfoExtractor
from .common import InfoExtractor, SearchInfoExtractor
from ..compat import (
compat_parse_qs,
compat_urlparse,
@@ -32,13 +33,14 @@ class BiliBiliIE(InfoExtractor):
(?:
video/[aA][vV]|
anime/(?P<anime_id>\d+)/play\#
)(?P<id_bv>\d+)|
video/[bB][vV](?P<id>[^/?#&]+)
)(?P<id>\d+)|
video/[bB][vV](?P<id_bv>[^/?#&]+)
)
(?:/?\?p=(?P<page>\d+))?
'''
_TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/',
'url': 'http://www.bilibili.com/video/av1074402/',
'md5': '5f7d29e1a2872f3df0cf76b1f87d3788',
'info_dict': {
'id': '1074402',
@@ -56,6 +58,10 @@ class BiliBiliIE(InfoExtractor):
# Tested in BiliBiliBangumiIE
'url': 'http://bangumi.bilibili.com/anime/1869/play#40062',
'only_matching': True,
}, {
# bilibili.tv
'url': 'http://www.bilibili.tv/video/av1074402/',
'only_matching': True,
}, {
'url': 'http://bangumi.bilibili.com/anime/5802/play#100643',
'md5': '3f721ad1e75030cc06faf73587cfec57',
@@ -124,12 +130,20 @@ class BiliBiliIE(InfoExtractor):
url, smuggled_data = unsmuggle_url(url, {})
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id') or mobj.group('id_bv')
video_id = mobj.group('id_bv') or mobj.group('id')
av_id, bv_id = self._get_video_id_set(video_id, mobj.group('id_bv') is not None)
video_id = av_id
anime_id = mobj.group('anime_id')
page_id = mobj.group('page')
webpage = self._download_webpage(url, video_id)
if 'anime/' not in url:
cid = self._search_regex(
r'\bcid(?:["\']:|=)(\d+),["\']page(?:["\']:|=)' + str(page_id), webpage, 'cid',
default=None
) or self._search_regex(
r'\bcid(?:["\']:|=)(\d+)', webpage, 'cid',
default=None
) or compat_parse_qs(self._search_regex(
@@ -207,9 +221,9 @@ class BiliBiliIE(InfoExtractor):
break
title = self._html_search_regex(
('<h1[^>]+\btitle=(["\'])(?P<title>(?:(?!\1).)+)\1',
'(?s)<h1[^>]*>(?P<title>.+?)</h1>'), webpage, 'title',
group='title')
(r'<h1[^>]+\btitle=(["\'])(?P<title>(?:(?!\1).)+)\1',
r'(?s)<h1[^>]*>(?P<title>.+?)</h1>'), webpage, 'title',
group='title') + ('_p' + str(page_id) if page_id is not None else '')
description = self._html_search_meta('description', webpage)
timestamp = unified_timestamp(self._html_search_regex(
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time',
@@ -219,7 +233,8 @@ class BiliBiliIE(InfoExtractor):
# TODO 'view_count' requires deobfuscating Javascript
info = {
'id': video_id,
'id': str(video_id) if page_id is None else '%s_p%s' % (video_id, page_id),
'cid': cid,
'title': title,
'description': description,
'timestamp': timestamp,
@@ -235,27 +250,134 @@ class BiliBiliIE(InfoExtractor):
'uploader': uploader_mobj.group('name'),
'uploader_id': uploader_mobj.group('id'),
})
if not info.get('uploader'):
info['uploader'] = self._html_search_meta(
'author', webpage, 'uploader', default=None)
comments = None
if self._downloader.params.get('getcomments', False):
comments = self._get_all_comment_pages(video_id)
raw_danmaku = self._get_raw_danmaku(video_id, cid)
raw_tags = self._get_tags(video_id)
tags = list(map(lambda x: x['tag_name'], raw_tags))
top_level_info = {
'raw_danmaku': raw_danmaku,
'comments': comments,
'comment_count': len(comments) if comments is not None else None,
'tags': tags,
'raw_tags': raw_tags,
}
'''
# Requires https://github.com/m13253/danmaku2ass which is licenced under GPL3
# See https://github.com/animelover1984/youtube-dl
danmaku = NiconicoIE.CreateDanmaku(raw_danmaku, commentType='Bilibili', x=1024, y=576)
entries[0]['subtitles'] = {
'danmaku': [{
'ext': 'ass',
'data': danmaku
}]
}
'''
for entry in entries:
entry.update(info)
if len(entries) == 1:
entries[0].update(top_level_info)
return entries[0]
else:
for idx, entry in enumerate(entries):
entry['id'] = '%s_part%d' % (video_id, (idx + 1))
return {
global_info = {
'_type': 'multi_video',
'id': video_id,
'bv_id': bv_id,
'title': title,
'description': description,
'entries': entries,
}
global_info.update(info)
global_info.update(top_level_info)
return global_info
def _get_video_id_set(self, id, is_bv):
query = {'bvid': id} if is_bv else {'aid': id}
response = self._download_json(
"http://api.bilibili.cn/x/web-interface/view",
id, query=query,
note='Grabbing original ID via API')
if response['code'] == -400:
raise ExtractorError('Video ID does not exist', expected=True, video_id=id)
elif response['code'] != 0:
raise ExtractorError('Unknown error occurred during API check (code %s)' % response['code'], expected=True, video_id=id)
return (response['data']['aid'], response['data']['bvid'])
# recursive solution to getting every page of comments for the video
# we can stop when we reach a page without any comments
def _get_all_comment_pages(self, video_id, commentPageNumber=0):
comment_url = "https://api.bilibili.com/x/v2/reply?jsonp=jsonp&pn=%s&type=1&oid=%s&sort=2&_=1567227301685" % (commentPageNumber, video_id)
json_str = self._download_webpage(
comment_url, video_id,
note='Extracting comments from page %s' % (commentPageNumber))
replies = json.loads(json_str)['data']['replies']
if replies is None:
return []
return self._get_all_children(replies) + self._get_all_comment_pages(video_id, commentPageNumber + 1)
# extracts all comments in the tree
def _get_all_children(self, replies):
if replies is None:
return []
ret = []
for reply in replies:
author = reply['member']['uname']
author_id = reply['member']['mid']
id = reply['rpid']
text = reply['content']['message']
timestamp = reply['ctime']
parent = reply['parent'] if reply['parent'] != 0 else 'root'
comment = {
"author": author,
"author_id": author_id,
"id": id,
"text": text,
"timestamp": timestamp,
"parent": parent,
}
ret.append(comment)
# from the JSON, the comment structure seems arbitrarily deep, but I could be wrong.
# Regardless, this should work.
ret += self._get_all_children(reply['replies'])
return ret
def _get_raw_danmaku(self, video_id, cid):
# This will be useful if I decide to scrape all pages instead of doing them individually
# cid_url = "https://www.bilibili.com/widget/getPageList?aid=%s" % (video_id)
# cid_str = self._download_webpage(cid_url, video_id, note=False)
# cid = json.loads(cid_str)[0]['cid']
danmaku_url = "https://comment.bilibili.com/%s.xml" % (cid)
danmaku = self._download_webpage(danmaku_url, video_id, note='Downloading danmaku comments')
return danmaku
def _get_tags(self, video_id):
tags_url = "https://api.bilibili.com/x/tag/archive/tags?aid=%s" % (video_id)
tags_json = self._download_json(tags_url, video_id, note='Downloading tags')
return tags_json['data']
class BiliBiliBangumiIE(InfoExtractor):
_VALID_URL = r'https?://bangumi\.bilibili\.com/anime/(?P<id>\d+)'
@@ -324,6 +446,73 @@ class BiliBiliBangumiIE(InfoExtractor):
season_info.get('bangumi_title'), season_info.get('evaluate'))
class BilibiliChannelIE(InfoExtractor):
_VALID_URL = r'https?://space.bilibili\.com/(?P<id>\d+)'
# May need to add support for pagination? Need to find a user with many video uploads to test
_API_URL = "https://api.bilibili.com/x/space/arc/search?mid=%s&pn=1&ps=25&jsonp=jsonp"
_TEST = {} # TODO: Add tests
def _real_extract(self, url):
list_id = self._match_id(url)
json_str = self._download_webpage(self._API_URL % list_id, "None")
json_parsed = json.loads(json_str)
entries = [{
'_type': 'url',
'ie_key': BiliBiliIE.ie_key(),
'url': ('https://www.bilibili.com/video/%s' %
entry['bvid']),
'id': entry['bvid'],
} for entry in json_parsed['data']['list']['vlist']]
return {
'_type': 'playlist',
'id': list_id,
'entries': entries
}
class BiliBiliSearchIE(SearchInfoExtractor):
IE_DESC = 'Bilibili video search, "bilisearch" keyword'
_MAX_RESULTS = 100000
_SEARCH_KEY = 'bilisearch'
MAX_NUMBER_OF_RESULTS = 1000
def _get_n_results(self, query, n):
"""Get a specified number of results for a query"""
entries = []
pageNumber = 0
while True:
pageNumber += 1
# FIXME
api_url = "https://api.bilibili.com/x/web-interface/search/type?context=&page=%s&order=pubdate&keyword=%s&duration=0&tids_2=&__refresh__=true&search_type=video&tids=0&highlight=1" % (pageNumber, query)
json_str = self._download_webpage(
api_url, "None", query={"Search_key": query},
note='Extracting results from page %s' % pageNumber)
data = json.loads(json_str)['data']
# FIXME: this is hideous
if "result" not in data:
return {
'_type': 'playlist',
'id': query,
'entries': entries[:n]
}
videos = data['result']
for video in videos:
e = self.url_result(video['arcurl'], 'BiliBili', str(video['aid']))
entries.append(e)
if(len(entries) >= n or len(videos) >= BiliBiliSearchIE.MAX_NUMBER_OF_RESULTS):
return {
'_type': 'playlist',
'id': query,
'entries': entries[:n]
}
class BilibiliAudioBaseIE(InfoExtractor):
def _call_api(self, path, sid, query=None):
if not query:
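The `_get_all_children` helper added in this file flattens a nested reply tree into a single list. A minimal standalone sketch of the same depth-first traversal follows. The field names mirror those read in the diff, but the function name `flatten_replies` and the sample payload are hypothetical and exist only for illustration.

```python
def flatten_replies(replies):
    """Depth-first flatten of a nested reply tree; None means 'no replies'."""
    if replies is None:
        return []
    flat = []
    for reply in replies:
        flat.append({
            'author': reply['member']['uname'],
            'author_id': reply['member']['mid'],
            'id': reply['rpid'],
            'text': reply['content']['message'],
            'timestamp': reply['ctime'],
            # A parent of 0 marks a top-level comment
            'parent': reply['parent'] if reply['parent'] != 0 else 'root',
        })
        # Children are emitted right after their parent
        flat += flatten_replies(reply.get('replies'))
    return flat


# Hypothetical payload: one top-level comment with a single reply
sample = [{
    'member': {'uname': 'alice', 'mid': 10},
    'rpid': 1, 'ctime': 100, 'parent': 0,
    'content': {'message': 'first'},
    'replies': [{
        'member': {'uname': 'bob', 'mid': 11},
        'rpid': 2, 'ctime': 101, 'parent': 1,
        'content': {'message': 'reply'},
        'replies': None,
    }],
}]
flat_comments = flatten_replies(sample)
```

Because the recursion depth follows the nesting of the replies, a very deep thread could in principle hit Python's recursion limit; an explicit stack would avoid that.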

@@ -90,13 +90,19 @@ class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})'
_TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms',
'md5': '2e4b0a997f9228ffa31fada5c53d1ed1',
'md5': '670b2d73f48549da032861130488c681',
'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'ext': 'flv',
'ext': 'mp4',
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
'upload_date': '20150723',
'timestamp': 1437679032,
},
'expected_warnings': [
'Unable to download f4m manifest'
]
}]
def _real_extract(self, url):

@@ -12,7 +12,7 @@ from ..utils import (
class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?(?P<req_id>bravotv|oxygen)\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
@@ -28,10 +28,13 @@ class BravoTVIE(AdobePassIE):
}, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
'only_matching': True,
}, {
'url': 'https://www.oxygen.com/in-ice-cold-blood/season-2/episode-16/videos/handling-the-horwitz-house-after-the-murder-season-2',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
site, display_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex(
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
@@ -53,11 +56,14 @@ class BravoTVIE(AdobePassIE):
tp_path = release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('tve_adobe_auth', {})
if site == 'bravotv':
site = 'bravo'
resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'),
adobe_pass.get('adobePassResourceId') or site,
tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
url, release_pid,
adobe_pass.get('adobePassRequestorId') or site, resource)
else:
shared_playlist = settings['ls_playlist']
account_pid = shared_playlist['account_pid']
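The switch from `adobe_pass.get(key, 'bravo')` to `adobe_pass.get(key) or site` in this hunk changes what happens when the key is present but empty. A small sketch of the difference follows, assuming a hypothetical settings dict; `resolve_requestor` is an illustrative name, not a function from the extractor.

```python
def resolve_requestor(adobe_pass, site):
    """Mirror the hunk's fallback: map 'bravotv' to 'bravo', then fall back to the site."""
    if site == 'bravotv':
        site = 'bravo'
    return adobe_pass.get('adobePassRequestorId') or site


# Hypothetical settings where the key exists but carries no usable value
empty_settings = {'adobePassRequestorId': ''}

# A .get() default only applies when the key is missing entirely
old_style = empty_settings.get('adobePassRequestorId', 'bravo')
# "or" also covers empty strings and None
new_style = resolve_requestor(empty_settings, 'oxygen')
```

Here `old_style` is the empty string, while `new_style` falls back to the site name.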

@@ -59,7 +59,7 @@ class CBSIE(CBSBaseIE):
'http://can.cbs.com/thunder/player/videoPlayerService.php',
content_id, query={'partner': site, 'contentId': content_id})
video_data = xpath_element(items_data, './/item')
title = xpath_text(video_data, 'videoTitle', 'title', True)
title = xpath_text(video_data, 'videoTitle', 'title') or xpath_text(video_data, 'videotitle', 'title')
tp_path = 'dJ5BDC/media/guid/%d/%s' % (mpx_acc, content_id)
tp_release_url = 'http://link.theplatform.com/s/' + tp_path

@@ -1,6 +1,7 @@
# coding: utf-8
from __future__ import unicode_literals
import datetime
import re
from .common import InfoExtractor
@@ -8,8 +9,8 @@ from ..utils import (
clean_html,
int_or_none,
parse_duration,
parse_iso8601,
parse_resolution,
try_get,
url_or_none,
)
@@ -24,8 +25,9 @@ class CCMAIE(InfoExtractor):
'ext': 'mp4',
'title': 'L\'espot de La Marató de TV3',
'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
'timestamp': 1470918540,
'upload_date': '20160811',
'timestamp': 1478608140,
'upload_date': '20161108',
'age_limit': 0,
}
}, {
'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
@@ -35,8 +37,24 @@ class CCMAIE(InfoExtractor):
'ext': 'mp3',
'title': 'El Consell de Savis analitza el derbi',
'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
'upload_date': '20171205',
'timestamp': 1512507300,
'upload_date': '20170512',
'timestamp': 1494622500,
'vcodec': 'none',
'categories': ['Esports'],
}
}, {
'url': 'http://www.ccma.cat/tv3/alacarta/crims/crims-josep-tallada-lespereu-me-capitol-1/video/6031387/',
'md5': 'b43c3d3486f430f3032b5b160d80cbc3',
'info_dict': {
'id': '6031387',
'ext': 'mp4',
'title': 'Crims - Josep Talleda, l\'"Espereu-me" (capítol 1)',
'description': 'md5:7cbdafb640da9d0d2c0f62bad1e74e60',
'timestamp': 1582577700,
'upload_date': '20200224',
'subtitles': 'mincount:4',
'age_limit': 16,
'series': 'Crims',
}
}]
@@ -72,17 +90,27 @@ class CCMAIE(InfoExtractor):
informacio = media['informacio']
title = informacio['titol']
durada = informacio.get('durada', {})
durada = informacio.get('durada') or {}
duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc'))
tematica = try_get(informacio, lambda x: x['tematica']['text'])
timestamp = None
data_utc = try_get(informacio, lambda x: x['data_emissio']['utc'])
try:
timestamp = datetime.datetime.strptime(
data_utc, '%Y-%d-%mT%H:%M:%S%z').timestamp()
except TypeError:
pass
subtitles = {}
subtitols = media.get('subtitols', {})
if subtitols:
sub_url = subtitols.get('url')
subtitols = media.get('subtitols') or []
if isinstance(subtitols, dict):
subtitols = [subtitols]
for st in subtitols:
sub_url = st.get('url')
if sub_url:
subtitles.setdefault(
subtitols.get('iso') or subtitols.get('text') or 'ca', []).append({
st.get('iso') or st.get('text') or 'ca', []).append({
'url': sub_url,
})
@@ -97,6 +125,16 @@ class CCMAIE(InfoExtractor):
'height': int_or_none(imatges.get('alcada')),
}]
age_limit = None
codi_etic = try_get(informacio, lambda x: x['codi_etic']['id'])
if codi_etic:
codi_etic_s = codi_etic.split('_')
if len(codi_etic_s) == 2:
if codi_etic_s[1] == 'TP':
age_limit = 0
else:
age_limit = int_or_none(codi_etic_s[1])
return {
'id': media_id,
'title': title,
@@ -106,4 +144,9 @@ class CCMAIE(InfoExtractor):
'thumbnails': thumbnails,
'subtitles': subtitles,
'formats': formats,
'age_limit': age_limit,
'alt_title': informacio.get('titol_complet'),
'episode_number': int_or_none(informacio.get('capitol')),
'categories': [tematica] if tematica else None,
'series': informacio.get('programa'),
}
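Two pieces of parsing in this file are easy to misread: the date format string `'%Y-%d-%mT%H:%M:%S%z'` deliberately reads the day before the month, and `codi_etic` is split on `_` to derive an age limit. The sketch below reproduces both as standalone functions. The input strings are hypothetical (the date is chosen to reproduce the first test's expected timestamp), the helper names are illustrative, and unlike the diff the sketch also catches `ValueError`.

```python
import datetime


def parse_swapped_utc(data_utc):
    """Parse a timestamp whose day and month fields are swapped."""
    try:
        return datetime.datetime.strptime(
            data_utc, '%Y-%d-%mT%H:%M:%S%z').timestamp()
    except (TypeError, ValueError):
        # TypeError: value was None; ValueError: string did not match the format
        return None


def parse_age_limit(codi_etic):
    """Derive an age limit from an id shaped like '<prefix>_<rating>'."""
    if not codi_etic:
        return None
    parts = codi_etic.split('_')
    if len(parts) != 2:
        return None
    if parts[1] == 'TP':
        return 0
    try:
        return int(parts[1])
    except ValueError:
        return None


# Hypothetical inputs
ts = parse_swapped_utc('2016-08-11T12:29:00+0000')  # read as 8 November 2016
limit_all = parse_age_limit('CPETIC_TP')
limit_16 = parse_age_limit('CPETIC_16')
```

With the swapped format, `'2016-08-11'` is interpreted as 8 November 2016, which is consistent with the updated `upload_date` of `20161108` in the first test.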

@@ -96,7 +96,7 @@ class CDAIE(InfoExtractor):
raise ExtractorError('This video is only available for premium users.', expected=True)
need_confirm_age = False
if self._html_search_regex(r'(<form[^>]+action="/a/validatebirth")',
if self._html_search_regex(r'(<form[^>]+action="[^"]*/a/validatebirth[^"]*")',
webpage, 'birthday validate form', default=None):
webpage = self._download_age_confirm_page(
url, video_id, note='Confirming age')
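The loosened pattern in this hunk accepts a form `action` that has text before or after `/a/validatebirth`. A quick comparison of the two patterns follows; both are copied from the diff, while the HTML snippets are hypothetical examples rather than markup captured from the site.

```python
import re

OLD_FORM_RE = r'(<form[^>]+action="/a/validatebirth")'
NEW_FORM_RE = r'(<form[^>]+action="[^"]*/a/validatebirth[^"]*")'

# Hypothetical markup: a relative action and an absolute one with a query string
relative_form = '<form method="post" action="/a/validatebirth">'
absolute_form = '<form method="post" action="https://www.cda.pl/a/validatebirth?return=x">'

results = {
    'old_relative': bool(re.search(OLD_FORM_RE, relative_form)),
    'old_absolute': bool(re.search(OLD_FORM_RE, absolute_form)),
    'new_relative': bool(re.search(NEW_FORM_RE, relative_form)),
    'new_absolute': bool(re.search(NEW_FORM_RE, absolute_form)),
}
```

Only the new pattern matches the absolute form, so the age confirmation step is no longer skipped when the site emits a full URL.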

@@ -1,142 +1,51 @@
from __future__ import unicode_literals
from .mtv import MTVServicesInfoExtractor
from .common import InfoExtractor
class ComedyCentralIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
(video-clips|episodes|cc-studios|video-collections|shows(?=/[^/]+/(?!full-episodes)))
/(?P<title>.*)'''
_VALID_URL = r'https?://(?:www\.)?cc\.com/(?:episodes|video(?:-clips)?)/(?P<id>[0-9a-z]{6})'
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
_TESTS = [{
'url': 'http://www.cc.com/video-clips/kllhuv/stand-up-greg-fitzsimmons--uncensored---too-good-of-a-mother',
'md5': 'c4f48e9eda1b16dd10add0744344b6d8',
'url': 'http://www.cc.com/video-clips/5ke9v2/the-daily-show-with-trevor-noah-doc-rivers-and-steve-ballmer---the-nba-player-strike',
'md5': 'b8acb347177c680ff18a292aa2166f80',
'info_dict': {
'id': 'cef0cbb3-e776-4bc9-b62e-8016deccb354',
'id': '89ccc86e-1b02-4f83-b0c9-1d9592ecd025',
'ext': 'mp4',
'title': 'CC:Stand-Up|August 18, 2013|1|0101|Uncensored - Too Good of a Mother',
'description': 'After a certain point, breastfeeding becomes c**kblocking.',
'timestamp': 1376798400,
'upload_date': '20130818',
'title': 'The Daily Show with Trevor Noah|August 28, 2020|25|25149|Doc Rivers and Steve Ballmer - The NBA Player Strike',
'description': 'md5:5334307c433892b85f4f5e5ac9ef7498',
'timestamp': 1598670000,
'upload_date': '20200829',
},
}, {
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/interviews/6yx39d/exclusive-rand-paul-extended-interview',
'url': 'http://www.cc.com/episodes/pnzzci/drawn-together--american-idol--parody-clip-show-season-3-ep-314',
'only_matching': True,
}]
class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
(?:full-episodes|shows(?=/[^/]+/full-episodes))
/(?P<id>[^?]+)'''
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
_TESTS = [{
'url': 'http://www.cc.com/full-episodes/pv391a/the-daily-show-with-trevor-noah-november-28--2016---ryan-speedo-green-season-22-ep-22028',
'info_dict': {
'description': 'Donald Trump is accused of exploiting his president-elect status for personal gain, Cuban leader Fidel Castro dies, and Ryan Speedo Green discusses "Sing for Your Life."',
'title': 'November 28, 2016 - Ryan Speedo Green',
},
'playlist_count': 4,
}, {
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
'only_matching': True,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
mgid = self._extract_mgid(webpage, url, data_zone='t2_lc_promo1')
videos_info = self._get_videos_info(mgid)
return videos_info
class ToshIE(MTVServicesInfoExtractor):
IE_DESC = 'Tosh.0'
_VALID_URL = r'^https?://tosh\.cc\.com/video-(?:clips|collections)/[^/]+/(?P<videotitle>[^/?#]+)'
_FEED_URL = 'http://tosh.cc.com/feeds/mrss'
_TESTS = [{
'url': 'http://tosh.cc.com/video-clips/68g93d/twitter-users-share-summer-plans',
'info_dict': {
'description': 'Tosh asked fans to share their summer plans.',
'title': 'Twitter Users Share Summer Plans',
},
'playlist': [{
'md5': 'f269e88114c1805bb6d7653fecea9e06',
'info_dict': {
'id': '90498ec2-ed00-11e0-aca6-0026b9414f30',
'ext': 'mp4',
'title': 'Tosh.0|June 9, 2077|2|211|Twitter Users Share Summer Plans',
'description': 'Tosh asked fans to share their summer plans.',
'thumbnail': r're:^https?://.*\.jpg',
# It's really reported to be published on year 2077
'upload_date': '20770610',
'timestamp': 3390510600,
'subtitles': {
'en': 'mincount:3',
},
},
}]
}, {
'url': 'http://tosh.cc.com/video-collections/x2iz7k/just-plain-foul/m5q4fp',
'url': 'https://www.cc.com/video/k3sdvm/the-daily-show-with-jon-stewart-exclusive-the-fourth-estate',
'only_matching': True,
}]
class ComedyCentralTVIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/(?:staffeln|shows)/(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/folgen/(?P<id>[0-9a-z]{6})'
_TESTS = [{
'url': 'http://www.comedycentral.tv/staffeln/7436-the-mindy-project-staffel-4',
'url': 'https://www.comedycentral.tv/folgen/pxdpec/josh-investigates-klimawandel-staffel-1-ep-1',
'info_dict': {
'id': 'local_playlist-f99b626bdfe13568579a',
'ext': 'flv',
'title': 'Episode_the-mindy-project_shows_season-4_episode-3_full-episode_part1',
'id': '15907dc3-ec3c-11e8-a442-0e40cf2fc285',
'ext': 'mp4',
'title': 'Josh Investigates',
'description': 'Steht uns das Ende der Welt bevor?',
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
'url': 'http://www.comedycentral.tv/shows/1074-workaholics',
'only_matching': True,
}, {
'url': 'http://www.comedycentral.tv/shows/1727-the-mindy-project/bonus',
'only_matching': True,
}]
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'
_GEO_COUNTRIES = ['DE']
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mrss_url = self._search_regex(
r'data-mrss=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'mrss url', group='url')
return self._get_videos_info_from_url(mrss_url, video_id)
class ComedyCentralShortnameIE(InfoExtractor):
_VALID_URL = r'^:(?P<id>tds|thedailyshow|theopposition)$'
_TESTS = [{
'url': ':tds',
'only_matching': True,
}, {
'url': ':thedailyshow',
'only_matching': True,
}, {
'url': ':theopposition',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
shortcut_map = {
'tds': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
'thedailyshow': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
'theopposition': 'http://www.cc.com/shows/the-opposition-with-jordan-klepper/full-episodes',
def _get_feed_query(self, uri):
return {
'accountOverride': 'intl.mtvi.com',
'arcEp': 'web.cc.tv',
'ep': 'b9032c3a',
'imageEp': 'web.cc.tv',
'mgid': uri,
}
return self.url_result(shortcut_map[video_id])
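The rewritten `_VALID_URL` patterns in this file match a fixed six-character ID instead of a free-form title. The sketch below checks both new patterns against URLs taken from the tests in the diff, plus one URL from a removed test that should no longer match. The helper `match_id` is a simplified, hypothetical stand-in for the extractor's own matching.

```python
import re

CC_URL_RE = r'https?://(?:www\.)?cc\.com/(?:episodes|video(?:-clips)?)/(?P<id>[0-9a-z]{6})'
CC_TV_URL_RE = r'https?://(?:www\.)?comedycentral\.tv/folgen/(?P<id>[0-9a-z]{6})'


def match_id(pattern, url):
    """Return the 'id' group if the URL matches from its start, else None."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None


clip_id = match_id(
    CC_URL_RE,
    'http://www.cc.com/video-clips/5ke9v2/the-daily-show-with-trevor-noah-doc-rivers-and-steve-ballmer---the-nba-player-strike')
episode_id = match_id(
    CC_URL_RE,
    'http://www.cc.com/episodes/pnzzci/drawn-together--american-idol--parody-clip-show-season-3-ep-314')
video_id = match_id(
    CC_URL_RE,
    'https://www.cc.com/video/k3sdvm/the-daily-show-with-jon-stewart-exclusive-the-fourth-estate')
tv_id = match_id(
    CC_TV_URL_RE,
    'https://www.comedycentral.tv/folgen/pxdpec/josh-investigates-klimawandel-staffel-1-ep-1')
# URL from a removed test; the new pattern no longer covers show listings
listing_id = match_id(
    CC_URL_RE,
    'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes')
```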

@@ -336,9 +336,8 @@ class InfoExtractor(object):
There must be a key "entries", which is a list, an iterable, or a PagedList
object, each element of which is a valid dictionary by this specification.
Additionally, playlists can have "id", "title", "description", "uploader",
"uploader_id", "uploader_url", "duration" attributes with the same semantics
as videos (see above).
Additionally, playlists can have "id", "title", and any other relevant
attributes with the same semantics as videos (see above).
_type "multi_video" indicates that there are multiple videos that
@@ -967,10 +966,11 @@ class InfoExtractor(object):
urls, playlist_id=playlist_id, playlist_title=playlist_title)
@staticmethod
def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None, **kwargs):
"""Returns a playlist"""
video_info = {'_type': 'playlist',
'entries': entries}
video_info.update(kwargs)
if playlist_id:
video_info['id'] = playlist_id
if playlist_title:
@@ -1366,17 +1366,17 @@ class InfoExtractor(object):
class FormatSort:
regex = r' *((?P<reverse>\+)?(?P<field>[a-zA-Z0-9_]+)((?P<seperator>[~:])(?P<limit>.*?))?)? *$'
default = ('hidden', 'has_video', 'extractor', 'lang', 'quality',
'res', 'fps', 'codec', 'size', 'br', 'asr',
'proto', 'ext', 'has_audio', 'source', 'format_id')
default = ('hidden', 'hasvid', 'ie_pref', 'lang', 'quality',
'res', 'fps', 'codec:vp9', 'size', 'br', 'asr',
'proto', 'ext', 'has_audio', 'source', 'format_id') # These must not be aliases
settings = {
'vcodec': {'type': 'ordered', 'regex': True,
'order': ['vp9', '(h265|he?vc?)', '(h264|avc)', 'vp8', '(mp4v|h263)', 'theora', '', None, 'none']},
'order': ['av0?1', 'vp9', '(h265|he?vc?)', '(h264|avc)', 'vp8', '(mp4v|h263)', 'theora', '', None, 'none']},
'acodec': {'type': 'ordered', 'regex': True,
'order': ['opus', 'vorbis', 'aac', 'mp?4a?', 'mp3', 'e?a?c-?3', 'dts', '', None, 'none']},
'protocol': {'type': 'ordered', 'regex': True,
'order': ['(ht|f)tps', '(ht|f)tp$', 'm3u8.+', 'm3u8', '.*dash', '', 'mms|rtsp', 'none', 'f4']},
'proto': {'type': 'ordered', 'regex': True, 'field': 'protocol',
'order': ['(ht|f)tps', '(ht|f)tp$', 'm3u8.+', 'm3u8', '.*dash', '', 'mms|rtsp', 'none', 'f4']},
'vext': {'type': 'ordered', 'field': 'video_ext',
'order': ('mp4', 'webm', 'flv', '', 'none'),
'order_free': ('webm', 'mp4', 'flv', '', 'none')},
@@ -1384,14 +1384,14 @@ class InfoExtractor(object):
'order': ('m4a', 'aac', 'mp3', 'ogg', 'opus', 'webm', '', 'none'),
'order_free': ('opus', 'ogg', 'webm', 'm4a', 'mp3', 'aac', '', 'none')},
'hidden': {'visible': False, 'forced': True, 'type': 'extractor', 'max': -1000},
'extractor_preference': {'priority': True, 'type': 'extractor'},
'has_video': {'priority': True, 'field': 'vcodec', 'type': 'boolean', 'not_in_list': ('none',)},
'has_audio': {'field': 'acodec', 'type': 'boolean', 'not_in_list': ('none',)},
'language_preference': {'priority': True, 'convert': 'ignore'},
'ie_pref': {'priority': True, 'type': 'extractor', 'field': 'extractor_preference'},
'hasvid': {'priority': True, 'field': 'vcodec', 'type': 'boolean', 'not_in_list': ('none',)},
'hasaud': {'field': 'acodec', 'type': 'boolean', 'not_in_list': ('none',)},
'lang': {'priority': True, 'convert': 'ignore', 'field': 'language_preference'},
'quality': {'priority': True, 'convert': 'float_none'},
'filesize': {'convert': 'bytes'},
'filesize_approx': {'convert': 'bytes'},
'format_id': {'convert': 'string'},
'fs_approx': {'convert': 'bytes', 'field': 'filesize_approx'},
'id': {'convert': 'string', 'field': 'format_id'},
'height': {'convert': 'float_none'},
'width': {'convert': 'float_none'},
'fps': {'convert': 'float_none'},
@@ -1399,32 +1399,42 @@ class InfoExtractor(object):
'vbr': {'convert': 'float_none'},
'abr': {'convert': 'float_none'},
'asr': {'convert': 'float_none'},
'source_preference': {'convert': 'ignore'},
'source': {'convert': 'ignore', 'field': 'source_preference'},
'codec': {'type': 'combined', 'field': ('vcodec', 'acodec')},
'bitrate': {'type': 'combined', 'field': ('tbr', 'vbr', 'abr'), 'same_limit': True},
'filesize_estimate': {'type': 'combined', 'same_limit': True, 'field': ('filesize', 'filesize_approx')},
'extension': {'type': 'combined', 'field': ('vext', 'aext')},
'dimension': {'type': 'multiple', 'field': ('height', 'width'), 'function': min}, # not named as 'resolution' because such a field exists
'res': {'type': 'alias', 'field': 'dimension'},
'ext': {'type': 'alias', 'field': 'extension'},
'br': {'type': 'alias', 'field': 'bitrate'},
'br': {'type': 'combined', 'field': ('tbr', 'vbr', 'abr'), 'same_limit': True},
'size': {'type': 'combined', 'same_limit': True, 'field': ('filesize', 'fs_approx')},
'ext': {'type': 'combined', 'field': ('vext', 'aext')},
'res': {'type': 'multiple', 'field': ('height', 'width'), 'function': min},
# Most of these exist only for compatibility reasons
'dimension': {'type': 'alias', 'field': 'res'},
'resolution': {'type': 'alias', 'field': 'res'},
'extension': {'type': 'alias', 'field': 'ext'},
'bitrate': {'type': 'alias', 'field': 'br'},
'total_bitrate': {'type': 'alias', 'field': 'tbr'},
'video_bitrate': {'type': 'alias', 'field': 'vbr'},
'audio_bitrate': {'type': 'alias', 'field': 'abr'},
'framerate': {'type': 'alias', 'field': 'fps'},
'lang': {'type': 'alias', 'field': 'language_preference'}, # not named as 'language' because such a field exists
'proto': {'type': 'alias', 'field': 'protocol'},
'source': {'type': 'alias', 'field': 'source_preference'},
'size': {'type': 'alias', 'field': 'filesize_estimate'},
'language_preference': {'type': 'alias', 'field': 'lang'}, # not named as 'language' because such a field exists
'protocol': {'type': 'alias', 'field': 'proto'},
'source_preference': {'type': 'alias', 'field': 'source'},
'filesize_approx': {'type': 'alias', 'field': 'fs_approx'},
'filesize_estimate': {'type': 'alias', 'field': 'size'},
'samplerate': {'type': 'alias', 'field': 'asr'},
'video_ext': {'type': 'alias', 'field': 'vext'},
'audio_ext': {'type': 'alias', 'field': 'aext'},
'video_codec': {'type': 'alias', 'field': 'vcodec'},
'audio_codec': {'type': 'alias', 'field': 'acodec'},
'video': {'type': 'alias', 'field': 'has_video'},
'audio': {'type': 'alias', 'field': 'has_audio'},
'extractor': {'type': 'alias', 'field': 'extractor_preference'},
'preference': {'type': 'alias', 'field': 'extractor_preference'}}
'video': {'type': 'alias', 'field': 'hasvid'},
'has_video': {'type': 'alias', 'field': 'hasvid'},
'audio': {'type': 'alias', 'field': 'hasaud'},
'has_audio': {'type': 'alias', 'field': 'hasaud'},
'extractor': {'type': 'alias', 'field': 'ie_pref'},
'preference': {'type': 'alias', 'field': 'ie_pref'},
'extractor_preference': {'type': 'alias', 'field': 'ie_pref'},
'format_id': {'type': 'alias', 'field': 'id'},
}
_order = []
@@ -2254,7 +2264,7 @@ class InfoExtractor(object):
})
return entries
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}, data=None, headers={}, query={}):
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
res = self._download_xml_handle(
mpd_url, video_id,
note=note or 'Downloading MPD manifest',
@@ -2268,10 +2278,9 @@ class InfoExtractor(object):
mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats(
mpd_doc, mpd_id=mpd_id, mpd_base_url=mpd_base_url,
formats_dict=formats_dict, mpd_url=mpd_url)
mpd_doc, mpd_id, mpd_base_url, mpd_url)
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None):
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', mpd_url=None):
"""
Parse formats from MPD manifest.
References:
@@ -2550,15 +2559,7 @@ class InfoExtractor(object):
else:
# Assuming direct URL to unfragmented media.
f['url'] = base_url
# According to [1, 5.3.5.2, Table 7, page 35] @id of Representation
# is not necessarily unique within a Period thus formats with
# the same `format_id` are quite possible. There are numerous examples
# of such manifests (see https://github.com/ytdl-org/youtube-dl/issues/15111,
# https://github.com/ytdl-org/youtube-dl/issues/13919)
full_info = formats_dict.get(representation_id, {}).copy()
full_info.update(f)
formats.append(full_info)
formats.append(f)
else:
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
return formats
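The `FormatSort.settings` change in this file inverts the alias direction: short names such as `res` and `id` become the real entries, and the long names become `alias` entries pointing at them. How `FormatSort` resolves aliases is not shown in this diff, so the sketch below is only an assumption: it follows the `field` pointer until it reaches a non-alias entry. The `settings` excerpt is copied from the new side of the diff; `resolve_alias` is a hypothetical helper.

```python
# Excerpt of the new settings table from the diff
settings = {
    'res': {'type': 'multiple', 'field': ('height', 'width'), 'function': min},
    'id': {'convert': 'string', 'field': 'format_id'},
    'dimension': {'type': 'alias', 'field': 'res'},
    'resolution': {'type': 'alias', 'field': 'res'},
    'format_id': {'type': 'alias', 'field': 'id'},
}


def resolve_alias(table, name):
    """Follow alias entries until a non-alias (or unknown) name is reached."""
    seen = set()
    while table.get(name, {}).get('type') == 'alias':
        if name in seen:
            raise ValueError('alias cycle at %r' % name)
        seen.add(name)
        name = table[name]['field']
    return name


resolved = {name: resolve_alias(settings, name)
            for name in ('dimension', 'resolution', 'format_id', 'res')}
```

Note that `id` carries `'field': 'format_id'` without being an alias, so resolution stops there; this is why the comment on `default` says its entries must not be aliases.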

@@ -12,7 +12,14 @@ from ..utils import (
)
class EggheadCourseIE(InfoExtractor):
class EggheadBaseIE(InfoExtractor):
def _call_api(self, path, video_id, resource, fatal=True):
return self._download_json(
'https://app.egghead.io/api/v1/' + path,
video_id, 'Downloading %s JSON' % resource, fatal=fatal)
class EggheadCourseIE(EggheadBaseIE):
IE_DESC = 'egghead.io course'
IE_NAME = 'egghead:course'
_VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)'
@@ -28,10 +35,9 @@ class EggheadCourseIE(InfoExtractor):
def _real_extract(self, url):
playlist_id = self._match_id(url)
lessons = self._download_json(
'https://egghead.io/api/v1/series/%s/lessons' % playlist_id,
playlist_id, 'Downloading course lessons JSON')
series_path = 'series/' + playlist_id
lessons = self._call_api(
series_path + '/lessons', playlist_id, 'course lessons')
entries = []
for lesson in lessons:
@@ -44,9 +50,8 @@ class EggheadCourseIE(InfoExtractor):
entries.append(self.url_result(
lesson_url, ie=EggheadLessonIE.ie_key(), video_id=lesson_id))
course = self._download_json(
'https://egghead.io/api/v1/series/%s' % playlist_id,
playlist_id, 'Downloading course JSON', fatal=False) or {}
course = self._call_api(
series_path, playlist_id, 'course', False) or {}
playlist_id = course.get('id')
if playlist_id:
@@ -57,7 +62,7 @@ class EggheadCourseIE(InfoExtractor):
course.get('description'))
class EggheadLessonIE(InfoExtractor):
class EggheadLessonIE(EggheadBaseIE):
IE_DESC = 'egghead.io lesson'
IE_NAME = 'egghead:lesson'
_VALID_URL = r'https://egghead\.io/(?:api/v1/)?lessons/(?P<id>[^/?#&]+)'
@@ -74,7 +79,7 @@ class EggheadLessonIE(InfoExtractor):
'upload_date': '20161209',
'duration': 304,
'view_count': 0,
'tags': ['javascript', 'free'],
'tags': 'count:2',
},
'params': {
'skip_download': True,
@@ -88,8 +93,8 @@ class EggheadLessonIE(InfoExtractor):
def _real_extract(self, url):
display_id = self._match_id(url)
lesson = self._download_json(
'https://egghead.io/api/v1/lessons/%s' % display_id, display_id)
lesson = self._call_api(
'lessons/' + display_id, display_id, 'lesson')
lesson_id = compat_str(lesson['id'])
title = lesson['title']

@@ -50,7 +50,10 @@ from .animelab import (
AnimeLabIE,
AnimeLabShowsIE,
)
from .americastestkitchen import AmericasTestKitchenIE
from .americastestkitchen import (
AmericasTestKitchenIE,
AmericasTestKitchenSeasonIE,
)
from .animeondemand import AnimeOnDemandIE
from .anvato import AnvatoIE
from .aol import AolIE
@@ -87,6 +90,11 @@ from .atvat import ATVAtIE
from .audimedia import AudiMediaIE
from .audioboom import AudioBoomIE
from .audiomack import AudiomackIE, AudiomackAlbumIE
from .audius import (
AudiusIE,
AudiusTrackIE,
AudiusPlaylistIE
)
from .awaan import (
AWAANIE,
AWAANVideoIE,
@@ -119,10 +127,12 @@ from .bigflix import BigflixIE
from .bild import BildIE
from .bilibili import (
BiliBiliIE,
BiliBiliSearchIE,
BiliBiliBangumiIE,
BilibiliAudioIE,
BilibiliAudioAlbumIE,
BiliBiliPlayerIE,
BilibiliChannelIE,
)
from .biobiochiletv import BioBioChileTVIE
from .bitchute import (
@@ -244,11 +254,8 @@ from .cnn import (
)
from .coub import CoubIE
from .comedycentral import (
ComedyCentralFullEpisodesIE,
ComedyCentralIE,
ComedyCentralShortnameIE,
ComedyCentralTVIE,
ToshIE,
)
from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
from .commonprotocols import (
@@ -677,6 +684,16 @@ from .microsoftvirtualacademy import (
MicrosoftVirtualAcademyIE,
MicrosoftVirtualAcademyCourseIE,
)
from .mildom import (
MildomIE,
MildomVodIE,
MildomUserVodIE,
)
from .minds import (
MindsIE,
MindsChannelIE,
MindsGroupIE,
)
from .ministrygrid import MinistryGridIE
from .minoto import MinotoIE
from .miomio import MioMioIE
@@ -1157,6 +1174,10 @@ from .stitcher import StitcherIE
from .sport5 import Sport5IE
from .sportbox import SportBoxIE
from .sportdeutschland import SportDeutschlandIE
from .spotify import (
SpotifyIE,
SpotifyShowIE,
)
from .spreaker import (
SpreakerIE,
SpreakerPageIE,
@@ -1265,7 +1286,10 @@ from .toutv import TouTvIE
from .toypics import ToypicsUserIE, ToypicsIE
from .traileraddict import TrailerAddictIE
from .trilulilu import TriluliluIE
from .trovolive import TrovoLiveIE
from .trovo import (
TrovoIE,
TrovoVodIE,
)
from .trunews import TruNewsIE
from .trutv import TruTVIE
from .tube8 import Tube8IE
@@ -1284,6 +1308,7 @@ from .tv2 import (
TV2IE,
TV2ArticleIE,
KatsomoIE,
MTVUutisetArticleIE,
)
from .tv2dk import (
TV2DKIE,
@@ -1424,7 +1449,6 @@ from .vidme import (
VidmeUserIE,
VidmeUserLikesIE,
)
from .vidzi import VidziIE
from .vier import VierIE, VierVideosIE
from .viewlift import (
ViewLiftIE,
@@ -1484,6 +1508,7 @@ from .vrv import (
VRVSeriesIE,
)
from .vshare import VShareIE
from .vtm import VTMIE
from .medialaan import MedialaanIE
from .vube import VubeIE
from .vuclip import VuClipIE

@@ -11,7 +11,7 @@ from ..utils import (
class FranceCultureIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TEST = {
_TESTS = [{
'url': 'http://www.franceculture.fr/emissions/carnet-nomade/rendez-vous-au-pays-des-geeks',
'info_dict': {
'id': 'rendez-vous-au-pays-des-geeks',
@@ -20,10 +20,14 @@ class FranceCultureIE(InfoExtractor):
'title': 'Rendez-vous au pays des geeks',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20140301',
'timestamp': 1393642916,
'timestamp': 1393700400,
'vcodec': 'none',
}
}
}, {
# no thumbnail
'url': 'https://www.franceculture.fr/emissions/la-recherche-montre-en-main/la-recherche-montre-en-main-du-mercredi-10-octobre-2018',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
@@ -36,19 +40,19 @@ class FranceCultureIE(InfoExtractor):
</h1>|
<div[^>]+class="[^"]*?(?:title-zone-diffusion|heading-zone-(?:wrapper|player-button))[^"]*?"[^>]*>
).*?
(<button[^>]+data-asset-source="[^"]+"[^>]+>)
(<button[^>]+data-(?:url|asset-source)="[^"]+"[^>]+>)
''',
webpage, 'video data'))
video_url = video_data['data-asset-source']
title = video_data.get('data-asset-title') or self._og_search_title(webpage)
video_url = video_data.get('data-url') or video_data['data-asset-source']
title = video_data.get('data-asset-title') or video_data.get('data-diffusion-title') or self._og_search_title(webpage)
description = self._html_search_regex(
r'(?s)<div[^>]+class="intro"[^>]*>.*?<h2>(.+?)</h2>',
webpage, 'description', default=None)
thumbnail = self._search_regex(
r'(?s)<figure[^>]+itemtype="https://schema.org/ImageObject"[^>]*>.*?<img[^>]+(?:data-dejavu-)?src="([^"]+)"',
webpage, 'thumbnail', fatal=False)
webpage, 'thumbnail', default=None)
uploader = self._html_search_regex(
r'(?s)<span class="author">(.*?)</span>',
webpage, 'uploader', default=None)
@@ -64,6 +68,6 @@ class FranceCultureIE(InfoExtractor):
'ext': ext,
'vcodec': 'none' if ext == 'mp3' else None,
'uploader': uploader,
'timestamp': int_or_none(video_data.get('data-asset-created-date')),
'timestamp': int_or_none(video_data.get('data-start-time')) or int_or_none(video_data.get('data-asset-created-date')),
'duration': int_or_none(video_data.get('data-duration')),
}
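
Reviewer sketch (not part of the commit): the hunks above prefer the newer `data-url` and `data-start-time` attributes and fall back to the older ones. The snippet below reproduces that fallback order in isolation. `int_or_none` here is a simplified, hypothetical stand-in for the helper imported from `..utils`, and `pick_media_fields` is a name invented for illustration.

```python
def int_or_none(value):
    """Simplified stand-in (assumption) for the utils helper of the same name."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def pick_media_fields(video_data):
    """Return (video_url, timestamp), preferring the newer data-* attributes."""
    # 'data-asset-source' is indexed directly, so a page carrying neither
    # attribute raises KeyError, as in the diff.
    video_url = video_data.get('data-url') or video_data['data-asset-source']
    # The older creation date is only consulted when 'data-start-time'
    # is missing, not an integer, or zero.
    timestamp = (int_or_none(video_data.get('data-start-time'))
                 or int_or_none(video_data.get('data-asset-created-date')))
    return video_url, timestamp
```

Because both lookups are chained with `or`, an empty string or a `0` in the newer attribute also triggers the fallback, which seems to be the intent here.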


@@ -131,6 +131,7 @@ from .gedi import GediEmbedsIE
from .rcs import RCSEmbedsIE
from .bitchute import BitChuteIE
from .arcpublishing import ArcPublishingIE
from .medialaan import MedialaanIE
class GenericIE(InfoExtractor):
@@ -2224,6 +2225,20 @@ class GenericIE(InfoExtractor):
'duration': 1581,
},
},
{
# MyChannels SDK embed
# https://www.24kitchen.nl/populair/deskundige-dit-waarom-sommigen-gevoelig-zijn-voor-voedselallergieen
'url': 'https://www.demorgen.be/nieuws/burgemeester-rotterdam-richt-zich-in-videoboodschap-tot-relschoppers-voelt-het-goed~b0bcfd741/',
'md5': '90c0699c37006ef18e198c032d81739c',
'info_dict': {
'id': '194165',
'ext': 'mp4',
'title': 'Burgemeester Aboutaleb spreekt relschoppers toe',
'timestamp': 1611740340,
'upload_date': '20210127',
'duration': 159,
},
},
]
def report_following_redirect(self, new_url):
@@ -2463,6 +2478,9 @@ class GenericIE(InfoExtractor):
webpage = self._webpage_read_content(
full_response, url, video_id, prefix=first_bytes)
if '<title>DPG Media Privacy Gate</title>' in webpage:
webpage = self._download_webpage(url, video_id)
self.report_extraction(video_id)
# Is it an RSS feed, a SMIL file, an XSPF playlist or a MPD manifest?
@@ -2594,6 +2612,11 @@ class GenericIE(InfoExtractor):
if arc_urls:
return self.playlist_from_matches(arc_urls, video_id, video_title, ie=ArcPublishingIE.ie_key())
mychannels_urls = MedialaanIE._extract_urls(webpage)
if mychannels_urls:
return self.playlist_from_matches(
mychannels_urls, video_id, video_title, ie=MedialaanIE.ie_key())
# Look for embedded rtl.nl player
matches = re.findall(
r'<iframe[^>]+?src="((?:https?:)?//(?:(?:www|static)\.)?rtl\.nl/(?:system/videoplayer/[^"]+(?:video_)?)?embed[^"]+)"',


@@ -7,6 +7,7 @@ from ..compat import compat_parse_qs
from ..utils import (
determine_ext,
ExtractorError,
get_element_by_class,
int_or_none,
lowercase_escape,
try_get,
@@ -237,7 +238,7 @@ class GoogleDriveIE(InfoExtractor):
if confirmation_webpage:
confirm = self._search_regex(
r'confirm=([^&"\']+)', confirmation_webpage,
'confirmation code', fatal=False)
'confirmation code', default=None)
if confirm:
confirmed_source_url = update_url_query(source_url, {
'confirm': confirm,
@@ -245,6 +246,11 @@ class GoogleDriveIE(InfoExtractor):
urlh = request_source_file(confirmed_source_url, 'confirmed source')
if urlh and urlh.headers.get('Content-Disposition'):
add_source_format(urlh)
else:
self.report_warning(
get_element_by_class('uc-error-subcaption', confirmation_webpage)
or get_element_by_class('uc-error-caption', confirmation_webpage)
or 'unable to extract confirmation code')
if not formats and reason:
raise ExtractorError(reason, expected=True)


@@ -5,7 +5,10 @@ import functools
import json
from .common import InfoExtractor
from ..compat import compat_str
from ..compat import (
compat_str,
compat_urllib_parse_unquote,
)
from ..utils import (
determine_ext,
ExtractorError,
@@ -131,6 +134,9 @@ class LBRYIE(LBRYBaseIE):
}, {
'url': 'https://lbry.tv/$/download/Episode-1/e7d93d772bd87e2b62d5ab993c1c3ced86ebb396',
'only_matching': True,
}, {
'url': 'https://lbry.tv/@lacajadepandora:a/TRUMP-EST%C3%81-BIEN-PUESTO-con-Pilar-Baselga,-Carlos-Senra,-Luis-Palacios-(720p_30fps_H264-192kbit_AAC):1',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -139,6 +145,7 @@ class LBRYIE(LBRYBaseIE):
display_id = display_id.split('/', 2)[-1].replace('/', ':')
else:
display_id = display_id.replace(':', '#')
display_id = compat_urllib_parse_unquote(display_id)
uri = 'lbry://' + display_id
result = self._resolve_url(uri, display_id, 'stream')
result_value = result['value']
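
Reviewer sketch (not part of the commit): the added `compat_urllib_parse_unquote` call percent-decodes the display ID before it is turned into an `lbry://` URI. The snippet below shows the effect on the non-download branch, assuming that on Python 3 the compat shim behaves like `urllib.parse.unquote`. `build_lbry_uri` is a hypothetical name used only for this illustration.

```python
from urllib.parse import unquote


def build_lbry_uri(display_id):
    """Mirror the non-download branch: ':' becomes '#', then percent-decode."""
    display_id = display_id.replace(':', '#')
    # Decode sequences such as %C3%81 so the resolver receives the claim
    # name in its real (UTF-8) form.
    display_id = unquote(display_id)
    return 'lbry://' + display_id
```

Called with `'@lacajadepandora:a/TRUMP-EST%C3%81-BIEN:1'` (a shortened form of the new test URL's path), this yields a URI containing the literal `Á` rather than the escape sequence.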


@@ -2,268 +2,113 @@ from __future__ import unicode_literals
import re
from .gigya import GigyaBaseIE
from ..compat import compat_str
from .common import InfoExtractor
from ..utils import (
extract_attributes,
int_or_none,
parse_duration,
try_get,
unified_timestamp,
mimetype2ext,
parse_iso8601,
)
class MedialaanIE(GigyaBaseIE):
class MedialaanIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:www\.|nieuws\.)?
(?:
(?P<site_id>vtm|q2|vtmkzoom)\.be/
(?:
video(?:/[^/]+/id/|/?\?.*?\baid=)|
(?:[^/]+/)*
)
(?:embed\.)?mychannels.video/embed/|
embed\.mychannels\.video/(?:s(?:dk|cript)/)?production/|
(?:www\.)?(?:
(?:
7sur7|
demorgen|
hln|
joe|
qmusic
)\.be|
(?:
[abe]d|
bndestem|
destentor|
gelderlander|
pzc|
tubantia|
volkskrant
)\.nl
)/video/(?:[^/]+/)*[^/?&#]+~p
)
(?P<id>[^/?#&]+)
(?P<id>\d+)
'''
_NETRC_MACHINE = 'medialaan'
_APIKEY = '3_HZ0FtkMW_gOyKlqQzW5_0FHRC7Nd5XpXJZcDdXY4pk5eES2ZWmejRW5egwVm4ug-'
_SITE_TO_APP_ID = {
'vtm': 'vtm_watch',
'q2': 'q2',
'vtmkzoom': 'vtmkzoom',
}
_TESTS = [{
# vod
'url': 'http://vtm.be/video/volledige-afleveringen/id/vtm_20170219_VM0678361_vtmwatch',
'url': 'https://www.bndestem.nl/video/de-terugkeer-van-ally-de-aap-en-wie-vertrekt-er-nog-bij-nac~p193993',
'info_dict': {
'id': 'vtm_20170219_VM0678361_vtmwatch',
'id': '193993',
'ext': 'mp4',
'title': 'Allemaal Chris afl. 6',
'description': 'md5:4be86427521e7b07e0adb0c9c554ddb2',
'timestamp': 1487533280,
'upload_date': '20170219',
'duration': 2562,
'series': 'Allemaal Chris',
'season': 'Allemaal Chris',
'season_number': 1,
'season_id': '256936078124527',
'episode': 'Allemaal Chris afl. 6',
'episode_number': 6,
'episode_id': '256936078591527',
'title': 'De terugkeer van Ally de Aap en wie vertrekt er nog bij NAC?',
'timestamp': 1611663540,
'upload_date': '20210126',
'duration': 238,
},
'params': {
'skip_download': True,
},
'skip': 'Requires account credentials',
}, {
# clip
'url': 'http://vtm.be/video?aid=168332',
'info_dict': {
'id': '168332',
'ext': 'mp4',
'title': '"Veronique liegt!"',
'description': 'md5:1385e2b743923afe54ba4adc38476155',
'timestamp': 1489002029,
'upload_date': '20170308',
'duration': 96,
},
}, {
# vod
'url': 'http://vtm.be/video/volledige-afleveringen/id/257107153551000',
'url': 'https://www.gelderlander.nl/video/kanalen/degelderlander~c320/series/snel-nieuws~s984/noodbevel-in-doetinchem-politie-stuurt-mensen-centrum-uit~p194093',
'only_matching': True,
}, {
# vod
'url': 'http://vtm.be/video?aid=163157',
'url': 'https://embed.mychannels.video/sdk/production/193993?options=TFTFF_default',
'only_matching': True,
}, {
# vod
'url': 'http://www.q2.be/video/volledige-afleveringen/id/2be_20170301_VM0684442_q2',
'url': 'https://embed.mychannels.video/script/production/193993',
'only_matching': True,
}, {
# clip
'url': 'http://vtmkzoom.be/k3-dansstudio/een-nieuw-seizoen-van-k3-dansstudio',
'url': 'https://embed.mychannels.video/production/193993',
'only_matching': True,
}, {
# http/s redirect
'url': 'https://vtmkzoom.be/video?aid=45724',
'info_dict': {
'id': '257136373657000',
'ext': 'mp4',
'title': 'K3 Dansstudio Ushuaia afl.6',
},
'params': {
'skip_download': True,
},
'skip': 'Requires account credentials',
'url': 'https://mychannels.video/embed/193993',
'only_matching': True,
}, {
# nieuws.vtm.be
'url': 'https://nieuws.vtm.be/stadion/stadion/genk-nog-moeilijk-programma',
'url': 'https://embed.mychannels.video/embed/193993',
'only_matching': True,
}]
def _real_initialize(self):
self._logged_in = False
def _login(self):
username, password = self._get_login_info()
if username is None:
self.raise_login_required()
auth_data = {
'APIKey': self._APIKEY,
'sdk': 'js_6.1',
'format': 'json',
'loginID': username,
'password': password,
}
auth_info = self._gigya_login(auth_data)
self._uid = auth_info['UID']
self._uid_signature = auth_info['UIDSignature']
self._signature_timestamp = auth_info['signatureTimestamp']
self._logged_in = True
@staticmethod
def _extract_urls(webpage):
entries = []
for element in re.findall(r'(<div[^>]+data-mychannels-type="video"[^>]*>)', webpage):
mychannels_id = extract_attributes(element).get('data-mychannels-id')
if mychannels_id:
entries.append('https://mychannels.video/embed/' + mychannels_id)
return entries
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id, site_id = mobj.group('id', 'site_id')
production_id = self._match_id(url)
production = self._download_json(
'https://embed.mychannels.video/sdk/production/' + production_id,
production_id, query={'options': 'UUUU_default'})['productions'][0]
title = production['title']
webpage = self._download_webpage(url, video_id)
config = self._parse_json(
self._search_regex(
r'videoJSConfig\s*=\s*JSON\.parse\(\'({.+?})\'\);',
webpage, 'config', default='{}'), video_id,
transform_source=lambda s: s.replace(
'\\\\', '\\').replace(r'\"', '"').replace(r"\'", "'"))
vod_id = config.get('vodId') or self._search_regex(
(r'\\"vodId\\"\s*:\s*\\"(.+?)\\"',
r'"vodId"\s*:\s*"(.+?)"',
r'<[^>]+id=["\']vod-(\d+)'),
webpage, 'video_id', default=None)
# clip, no authentication required
if not vod_id:
player = self._parse_json(
self._search_regex(
r'vmmaplayer\(({.+?})\);', webpage, 'vmma player',
default=''),
video_id, transform_source=lambda s: '[%s]' % s, fatal=False)
if player:
video = player[-1]
if video['videoUrl'] in ('http', 'https'):
return self.url_result(video['url'], MedialaanIE.ie_key())
info = {
'id': video_id,
'url': video['videoUrl'],
'title': video['title'],
'thumbnail': video.get('imageUrl'),
'timestamp': int_or_none(video.get('createdDate')),
'duration': int_or_none(video.get('duration')),
}
formats = []
for source in (production.get('sources') or []):
src = source.get('src')
if not src:
continue
ext = mimetype2ext(source.get('type'))
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
src, production_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
else:
info = self._parse_html5_media_entries(
url, webpage, video_id, m3u8_id='hls')[0]
info.update({
'id': video_id,
'title': self._html_search_meta('description', webpage),
'duration': parse_duration(self._html_search_meta('duration', webpage)),
formats.append({
'ext': ext,
'url': src,
})
# vod, authentication required
else:
if not self._logged_in:
self._login()
self._sort_formats(formats)
settings = self._parse_json(
self._search_regex(
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
webpage, 'drupal settings', default='{}'),
video_id)
def get(container, item):
return try_get(
settings, lambda x: x[container][item],
compat_str) or self._search_regex(
r'"%s"\s*:\s*"([^"]+)' % item, webpage, item,
default=None)
app_id = get('vod', 'app_id') or self._SITE_TO_APP_ID.get(site_id, 'vtm_watch')
sso = get('vod', 'gigyaDatabase') or 'vtm-sso'
data = self._download_json(
'http://vod.medialaan.io/api/1.0/item/%s/video' % vod_id,
video_id, query={
'app_id': app_id,
'user_network': sso,
'UID': self._uid,
'UIDSignature': self._uid_signature,
'signatureTimestamp': self._signature_timestamp,
})
formats = self._extract_m3u8_formats(
data['response']['uri'], video_id, entry_protocol='m3u8_native',
ext='mp4', m3u8_id='hls')
self._sort_formats(formats)
info = {
'id': vod_id,
'formats': formats,
}
api_key = get('vod', 'apiKey')
channel = get('medialaanGigya', 'channel')
if api_key:
videos = self._download_json(
'http://vod.medialaan.io/vod/v2/videos', video_id, fatal=False,
query={
'channels': channel,
'ids': vod_id,
'limit': 1,
'apikey': api_key,
})
if videos:
video = try_get(
videos, lambda x: x['response']['videos'][0], dict)
if video:
def get(container, item, expected_type=None):
return try_get(
video, lambda x: x[container][item], expected_type)
def get_string(container, item):
return get(container, item, compat_str)
info.update({
'series': get_string('program', 'title'),
'season': get_string('season', 'title'),
'season_number': int_or_none(get('season', 'number')),
'season_id': get_string('season', 'id'),
'episode': get_string('episode', 'title'),
'episode_number': int_or_none(get('episode', 'number')),
'episode_id': get_string('episode', 'id'),
'duration': int_or_none(
video.get('duration')) or int_or_none(
video.get('durationMillis'), scale=1000),
'title': get_string('episode', 'title'),
'description': get_string('episode', 'text'),
'timestamp': unified_timestamp(get_string(
'publication', 'begin')),
})
if not info.get('title'):
info['title'] = try_get(
config, lambda x: x['videoConfig']['title'],
compat_str) or self._html_search_regex(
r'\\"title\\"\s*:\s*\\"(.+?)\\"', webpage, 'title',
default=None) or self._og_search_title(webpage)
if not info.get('description'):
info['description'] = self._html_search_regex(
r'<div[^>]+class="field-item\s+even">\s*<p>(.+?)</p>',
webpage, 'description', default=None)
return info
return {
'id': production_id,
'title': title,
'formats': formats,
'thumbnail': production.get('posterUrl'),
'timestamp': parse_iso8601(production.get('publicationDate'), ' '),
'duration': int_or_none(production.get('duration')) or None,
}
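
Reviewer sketch (not part of the commit): the new `_extract_urls` static method scans a page for MyChannels `<div>` placeholders and turns each `data-mychannels-id` into an embed URL. The standalone version below keeps the same outer regex but replaces `extract_attributes` with a plain regex lookup, which is a simplification and an assumption about attribute quoting. `extract_mychannels_urls` is a hypothetical name.

```python
import re


def extract_mychannels_urls(webpage):
    """Collect embed URLs for every MyChannels video placeholder in the page."""
    urls = []
    for element in re.findall(
            r'(<div[^>]+data-mychannels-type="video"[^>]*>)', webpage):
        # Simplified attribute lookup; assumes double-quoted attribute values.
        match = re.search(r'data-mychannels-id="([^"]+)"', element)
        if match:
            urls.append('https://mychannels.video/embed/' + match.group(1))
    return urls
```

Placeholders without an ID are skipped silently, which matches the `if mychannels_id:` guard in the diff.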


@@ -0,0 +1,284 @@
# coding: utf-8
from __future__ import unicode_literals
from datetime import datetime
import itertools
import json
import base64
from .common import InfoExtractor
from ..utils import (
ExtractorError, std_headers,
update_url_query,
random_uuidv4,
try_get,
)
from ..compat import (
compat_urlparse,
compat_urllib_parse_urlencode,
compat_str,
)
class MildomBaseIE(InfoExtractor):
_GUEST_ID = None
_DISPATCHER_CONFIG = None
def _call_api(self, url, video_id, query={}, note='Downloading JSON metadata', init=False):
url = update_url_query(url, self._common_queries(query, init=init))
return self._download_json(url, video_id, note=note)['body']
def _common_queries(self, query={}, init=False):
dc = self._fetch_dispatcher_config()
r = {
'timestamp': self.iso_timestamp(),
'__guest_id': '' if init else self.guest_id(),
'__location': dc['location'],
'__country': dc['country'],
'__cluster': dc['cluster'],
'__platform': 'web',
'__la': self.lang_code(),
'__pcv': 'v2.9.44',
'sfr': 'pc',
'accessToken': '',
}
r.update(query)
return r
def _fetch_dispatcher_config(self):
if not self._DISPATCHER_CONFIG:
try:
tmp = self._download_json(
'https://disp.mildom.com/serverListV2', 'initialization',
note='Downloading dispatcher_config', data=json.dumps({
'protover': 0,
'data': base64.b64encode(json.dumps({
'fr': 'web',
'sfr': 'pc',
'devi': 'Windows',
'la': 'ja',
'gid': None,
'loc': '',
'clu': '',
'wh': '1919*810',
'rtm': self.iso_timestamp(),
'ua': std_headers['User-Agent'],
}).encode('utf8')).decode('utf8').replace('\n', ''),
}).encode('utf8'))
self._DISPATCHER_CONFIG = self._parse_json(base64.b64decode(tmp['data']), 'initialization')
except ExtractorError:
self._DISPATCHER_CONFIG = self._download_json(
'https://bookish-octo-barnacle.vercel.app/api/dispatcher_config', 'initialization',
note='Downloading dispatcher_config fallback')
return self._DISPATCHER_CONFIG
@staticmethod
def iso_timestamp():
'new Date().toISOString()'
return datetime.utcnow().isoformat()[0:-3] + 'Z'
def guest_id(self):
'getGuestId'
if self._GUEST_ID:
return self._GUEST_ID
self._GUEST_ID = try_get(
self, (
lambda x: x._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/guest/h5init', 'initialization',
note='Downloading guest token', init=True)['guest_id'] or None,
lambda x: x._get_cookies('https://www.mildom.com').get('gid').value,
lambda x: x._get_cookies('https://m.mildom.com').get('gid').value,
), compat_str) or ''
return self._GUEST_ID
def lang_code(self):
'getCurrentLangCode'
return 'ja'
class MildomIE(MildomBaseIE):
IE_NAME = 'mildom'
IE_DESC = 'Record ongoing live by specific user in Mildom'
_VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/(?P<id>\d+)'
def _real_extract(self, url):
video_id = self._match_id(url)
url = 'https://www.mildom.com/%s' % video_id
webpage = self._download_webpage(url, video_id)
enterstudio = self._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/live/enterstudio', video_id,
note='Downloading live metadata', query={'user_id': video_id})
title = try_get(
enterstudio, (
lambda x: self._html_search_meta('twitter:description', webpage),
lambda x: x['anchor_intro'],
), compat_str)
description = try_get(
enterstudio, (
lambda x: x['intro'],
lambda x: x['live_intro'],
), compat_str)
uploader = try_get(
enterstudio, (
lambda x: self._html_search_meta('twitter:title', webpage),
lambda x: x['loginname'],
), compat_str)
servers = self._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/live/liveserver', video_id,
note='Downloading live server list', query={
'user_id': video_id,
'live_server_type': 'hls',
})
stream_query = self._common_queries({
'streamReqId': random_uuidv4(),
'is_lhls': '0',
})
m3u8_url = update_url_query(servers['stream_server'] + '/%s_master.m3u8' % video_id, stream_query)
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', headers={
'Referer': 'https://www.mildom.com/',
'Origin': 'https://www.mildom.com',
}, note='Downloading m3u8 information')
del stream_query['streamReqId'], stream_query['timestamp']
for fmt in formats:
# Uses https://github.com/nao20010128nao/bookish-octo-barnacle by @nao20010128nao as a proxy
parsed = compat_urlparse.urlparse(fmt['url'])
parsed = parsed._replace(
netloc='bookish-octo-barnacle.vercel.app',
query=compat_urllib_parse_urlencode(stream_query, True),
path='/api' + parsed.path)
fmt['url'] = compat_urlparse.urlunparse(parsed)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'uploader': uploader,
'uploader_id': video_id,
'formats': formats,
'is_live': True,
}
class MildomVodIE(MildomBaseIE):
IE_NAME = 'mildom:vod'
IE_DESC = 'Download a VOD in Mildom'
_VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/playback/(?P<user_id>\d+)/(?P<id>(?P=user_id)-[a-zA-Z0-9]+)'
def _real_extract(self, url):
video_id = self._match_id(url)
m = self._VALID_URL_RE.match(url)
user_id = m.group('user_id')
url = 'https://www.mildom.com/playback/%s/%s' % (user_id, video_id)
webpage = self._download_webpage(url, video_id)
autoplay = self._call_api(
'https://cloudac.mildom.com/nonolive/videocontent/playback/getPlaybackDetail', video_id,
note='Downloading playback metadata', query={
'v_id': video_id,
})['playback']
title = try_get(
autoplay, (
lambda x: self._html_search_meta('og:description', webpage),
lambda x: x['title'],
), compat_str)
description = try_get(
autoplay, (
lambda x: x['video_intro'],
), compat_str)
uploader = try_get(
autoplay, (
lambda x: x['author_info']['login_name'],
), compat_str)
audio_formats = [{
'url': autoplay['audio_url'],
'format_id': 'audio',
'protocol': 'm3u8_native',
'vcodec': 'none',
'acodec': 'aac',
}]
video_formats = []
for fmt in autoplay['video_link']:
video_formats.append({
'format_id': 'video-%s' % fmt['name'],
'url': fmt['url'],
'protocol': 'm3u8_native',
'width': fmt['level'] * autoplay['video_width'] // autoplay['video_height'],
'height': fmt['level'],
'vcodec': 'h264',
'acodec': 'aac',
})
stream_query = self._common_queries({
'is_lhls': '0',
})
del stream_query['timestamp']
formats = audio_formats + video_formats
for fmt in formats:
fmt['ext'] = 'mp4'
parsed = compat_urlparse.urlparse(fmt['url'])
stream_query['path'] = parsed.path[5:]
parsed = parsed._replace(
netloc='bookish-octo-barnacle.vercel.app',
query=compat_urllib_parse_urlencode(stream_query, True),
path='/api/vod2/proxy')
fmt['url'] = compat_urlparse.urlunparse(parsed)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'uploader': uploader,
'uploader_id': user_id,
'formats': formats,
}
class MildomUserVodIE(MildomBaseIE):
IE_NAME = 'mildom:user:vod'
IE_DESC = 'Download all VODs from specific user in Mildom'
_VALID_URL = r'https?://(?:(?:www|m)\.)mildom\.com/profile/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.mildom.com/profile/10093333',
'info_dict': {
'id': '10093333',
'title': 'Uploads from ねこばたけ',
},
'playlist_mincount': 351,
}]
def _real_extract(self, url):
user_id = self._match_id(url)
self._downloader.report_warning('To download an ongoing live stream, use "https://www.mildom.com/%s" instead. This URL lists the VODs belonging to the user.' % user_id)
profile = self._call_api(
'https://cloudac.mildom.com/nonolive/gappserv/user/profileV2', user_id,
query={'user_id': user_id}, note='Downloading user profile')['user_info']
results = []
for page in itertools.count(1):
reply = self._call_api(
'https://cloudac.mildom.com/nonolive/videocontent/profile/playbackList',
user_id, note='Downloading page %d' % page, query={
'user_id': user_id,
'page': page,
'limit': '30',
})
if not reply:
break
results.extend('https://www.mildom.com/playback/%s/%s' % (user_id, x['v_id']) for x in reply)
return self.playlist_result([
self.url_result(u, ie=MildomVodIE.ie_key()) for u in results
], user_id, 'Uploads from %s' % profile['loginname'])
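
Reviewer sketch (not part of the commit): `iso_timestamp` above emulates JavaScript's `new Date().toISOString()` by slicing the output of `isoformat()`. `isoformat()` omits the fractional part entirely when `microsecond == 0`, so in that case the slice would cut into the seconds field. The variant below builds the millisecond part explicitly. It is an alternative offered for illustration, not the code in the commit, and the optional `now` parameter is an addition made only so the function can be exercised deterministically.

```python
from datetime import datetime, timezone


def iso_timestamp(now=None):
    """Format a UTC time as YYYY-MM-DDTHH:MM:SS.mmmZ."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Truncate microseconds to milliseconds and always emit three digits,
    # so the result has the same shape even when the fraction is zero.
    return now.strftime('%Y-%m-%dT%H:%M:%S') + '.%03dZ' % (now.microsecond // 1000)
```

Passing `datetime(2021, 2, 4, 12, 30, 45)` gives a `.000Z` suffix, whereas slicing `isoformat()` for the same value would drop the `:45`.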


@@ -0,0 +1,196 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
clean_html,
int_or_none,
str_or_none,
strip_or_none,
)
class MindsBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?minds\.com/'
def _call_api(self, path, video_id, resource, query=None):
api_url = 'https://www.minds.com/api/' + path
token = self._get_cookies(api_url).get('XSRF-TOKEN')
return self._download_json(
api_url, video_id, 'Downloading %s JSON metadata' % resource, headers={
'Referer': 'https://www.minds.com/',
'X-XSRF-TOKEN': token.value if token else '',
}, query=query)
class MindsIE(MindsBaseIE):
IE_NAME = 'minds'
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'(?:media|newsfeed|archive/view)/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.minds.com/media/100000000000086822',
'md5': '215a658184a419764852239d4970b045',
'info_dict': {
'id': '100000000000086822',
'ext': 'mp4',
'title': 'Minds intro sequence',
'thumbnail': r're:https?://.+\.png',
'uploader_id': 'ottman',
'upload_date': '20130524',
'timestamp': 1369404826,
'uploader': 'Bill Ottman',
'view_count': int,
'like_count': int,
'dislike_count': int,
'tags': ['animation'],
'comment_count': int,
'license': 'attribution-cc',
},
}, {
# entity.type == 'activity' and empty title
'url': 'https://www.minds.com/newsfeed/798025111988506624',
'md5': 'b2733a74af78d7fd3f541c4cbbaa5950',
'info_dict': {
'id': '798022190320226304',
'ext': 'mp4',
'title': '798022190320226304',
'uploader': 'ColinFlaherty',
'upload_date': '20180111',
'timestamp': 1515639316,
'uploader_id': 'ColinFlaherty',
},
}, {
'url': 'https://www.minds.com/archive/view/715172106794442752',
'only_matching': True,
}, {
# youtube perma_url
'url': 'https://www.minds.com/newsfeed/1197131838022602752',
'only_matching': True,
}]
def _real_extract(self, url):
entity_id = self._match_id(url)
entity = self._call_api(
'v1/entities/entity/' + entity_id, entity_id, 'entity')['entity']
if entity.get('type') == 'activity':
if entity.get('custom_type') == 'video':
video_id = entity['entity_guid']
else:
return self.url_result(entity['perma_url'])
else:
assert(entity['subtype'] == 'video')
video_id = entity_id
# 1080p and webm formats available only on the sources array
video = self._call_api(
'v2/media/video/' + video_id, video_id, 'video')
formats = []
for source in (video.get('sources') or []):
src = source.get('src')
if not src:
continue
formats.append({
'format_id': source.get('label'),
'height': int_or_none(source.get('size')),
'url': src,
})
self._sort_formats(formats)
entity = video.get('entity') or entity
owner = entity.get('ownerObj') or {}
uploader_id = owner.get('username')
tags = entity.get('tags')
if tags and isinstance(tags, compat_str):
tags = [tags]
thumbnail = None
poster = video.get('poster') or entity.get('thumbnail_src')
if poster:
urlh = self._request_webpage(poster, video_id, fatal=False)
if urlh:
thumbnail = urlh.geturl()
return {
'id': video_id,
'title': entity.get('title') or video_id,
'formats': formats,
'description': clean_html(entity.get('description')) or None,
'license': str_or_none(entity.get('license')),
'timestamp': int_or_none(entity.get('time_created')),
'uploader': strip_or_none(owner.get('name')),
'uploader_id': uploader_id,
'uploader_url': 'https://www.minds.com/' + uploader_id if uploader_id else None,
'view_count': int_or_none(entity.get('play:count')),
'like_count': int_or_none(entity.get('thumbs:up:count')),
'dislike_count': int_or_none(entity.get('thumbs:down:count')),
'tags': tags,
'comment_count': int_or_none(entity.get('comments:count')),
'thumbnail': thumbnail,
}
class MindsFeedBaseIE(MindsBaseIE):
_PAGE_SIZE = 150
def _entries(self, feed_id):
query = {'limit': self._PAGE_SIZE, 'sync': 1}
i = 1
while True:
data = self._call_api(
'v2/feeds/container/%s/videos' % feed_id,
feed_id, 'page %s' % i, query)
entities = data.get('entities') or []
for entity in entities:
guid = entity.get('guid')
if not guid:
continue
yield self.url_result(
'https://www.minds.com/newsfeed/' + guid,
MindsIE.ie_key(), guid)
query['from_timestamp'] = data['load-next']
if not (query['from_timestamp'] and len(entities) == self._PAGE_SIZE):
break
i += 1
def _real_extract(self, url):
feed_id = self._match_id(url)
feed = self._call_api(
'v1/%s/%s' % (self._FEED_PATH, feed_id),
feed_id, self._FEED_TYPE)[self._FEED_TYPE]
return self.playlist_result(
self._entries(feed['guid']), feed_id,
strip_or_none(feed.get('name')),
feed.get('briefdescription'))
class MindsChannelIE(MindsFeedBaseIE):
_FEED_TYPE = 'channel'
IE_NAME = 'minds:' + _FEED_TYPE
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'(?!(?:newsfeed|media|api|archive|groups)/)(?P<id>[^/?&#]+)'
_FEED_PATH = 'channel'
_TEST = {
'url': 'https://www.minds.com/ottman',
'info_dict': {
'id': 'ottman',
'title': 'Bill Ottman',
'description': 'Co-creator & CEO @minds',
},
'playlist_mincount': 54,
}
class MindsGroupIE(MindsFeedBaseIE):
_FEED_TYPE = 'group'
IE_NAME = 'minds:' + _FEED_TYPE
_VALID_URL = MindsBaseIE._VALID_URL_BASE + r'groups/profile/(?P<id>[0-9]+)'
_FEED_PATH = 'groups/group'
_TEST = {
'url': 'https://www.minds.com/groups/profile/785582576369672204/feed/videos',
'info_dict': {
'id': '785582576369672204',
'title': 'Cooking Videos',
},
'playlist_mincount': 1,
}
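
Reviewer sketch (not part of the commit): `MindsFeedBaseIE._entries` pages through a feed by passing the previous response's `load-next` value as `from_timestamp`, and stops when a page comes back short or without a cursor. The generator below isolates that loop. `fetch_page`, `demo_fetch` and the demo data are hypothetical, and `.get('load-next')` is used instead of the diff's direct indexing so a missing key ends the loop rather than raising.

```python
PAGE_SIZE = 150


def iter_guids(fetch_page, page_size=PAGE_SIZE):
    """Yield entity GUIDs page by page until the feed is exhausted."""
    query = {'limit': page_size, 'sync': 1}
    while True:
        data = fetch_page(dict(query))
        entities = data.get('entities') or []
        for entity in entities:
            guid = entity.get('guid')
            if guid:
                yield guid
        # The cursor for the next page comes from the current response.
        query['from_timestamp'] = data.get('load-next')
        # A short page or an empty cursor means there is nothing left.
        if not (query['from_timestamp'] and len(entities) == page_size):
            break


# Hypothetical two-page feed, keyed by the cursor that requests each page.
_DEMO_PAGES = {
    None: {'entities': [{'guid': '1'}, {'guid': '2'}], 'load-next': 't1'},
    't1': {'entities': [{'guid': '3'}], 'load-next': 't2'},
}


def demo_fetch(query):
    """Stand-in for the API call; selects a page by its from_timestamp."""
    return _DEMO_PAGES[query.get('from_timestamp')]
```

With `page_size=2`, `list(iter_guids(demo_fetch, page_size=2))` walks both demo pages and stops after the second one because it holds fewer than two entities.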


@@ -255,6 +255,10 @@ class MTVServicesInfoExtractor(InfoExtractor):
return try_get(feed, lambda x: x['result']['data']['id'], compat_str)
@staticmethod
def _extract_child_with_type(parent, t):
return next(c for c in parent['children'] if c.get('type') == t)
def _extract_new_triforce_mgid(self, webpage, url='', video_id=None):
if url == '':
return
@@ -332,6 +336,13 @@ class MTVServicesInfoExtractor(InfoExtractor):
if not mgid:
mgid = self._extract_triforce_mgid(webpage, data_zone)
if not mgid:
data = self._parse_json(self._search_regex(
r'__DATA__\s*=\s*({.+?});', webpage, 'data'), None)
main_container = self._extract_child_with_type(data, 'MainContainer')
video_player = self._extract_child_with_type(main_container, 'VideoPlayer')
mgid = video_player['props']['media']['video']['config']['uri']
return mgid
def _real_extract(self, url):
@@ -403,18 +414,6 @@ class MTVIE(MTVServicesInfoExtractor):
'only_matching': True,
}]
@staticmethod
def extract_child_with_type(parent, t):
children = parent['children']
return next(c for c in children if c.get('type') == t)
def _extract_mgid(self, webpage):
data = self._parse_json(self._search_regex(
r'__DATA__\s*=\s*({.+?});', webpage, 'data'), None)
main_container = self.extract_child_with_type(data, 'MainContainer')
video_player = self.extract_child_with_type(main_container, 'VideoPlayer')
return video_player['props']['media']['video']['config']['uri']
class MTVJapanIE(MTVServicesInfoExtractor):
IE_NAME = 'mtvjapan'
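
Reviewer sketch (not part of the commit): the change above moves the `__DATA__` lookup into the shared base class as a last-resort fallback. The tree walk it relies on can be shown on its own. The sample tree and the `mgid` value below are invented for illustration and only imitate the nesting implied by the key path in the diff.

```python
def extract_child_with_type(parent, t):
    """Return the first child whose 'type' equals t."""
    # next() raises StopIteration when no child matches.
    return next(c for c in parent['children'] if c.get('type') == t)


def extract_mgid(data):
    """Walk MainContainer -> VideoPlayer and return the configured URI."""
    main_container = extract_child_with_type(data, 'MainContainer')
    video_player = extract_child_with_type(main_container, 'VideoPlayer')
    return video_player['props']['media']['video']['config']['uri']


# Hypothetical page data shaped like the structure the extractor expects.
SAMPLE_DATA = {
    'children': [
        {'type': 'Header'},
        {'type': 'MainContainer', 'children': [
            {'type': 'VideoPlayer', 'props': {'media': {'video': {
                'config': {'uri': 'mgid:example:video:demo'}}}}},
        ]},
    ],
}
```

Since `next()` is called without a default, a page that lacks either node surfaces as `StopIteration` rather than as an `ExtractorError`, which callers may want to keep in mind.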


@@ -1,104 +1,125 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import str_to_int
from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
try_get,
url_or_none,
)
class NineGagIE(InfoExtractor):
IE_NAME = '9gag'
_VALID_URL = r'https?://(?:www\.)?9gag(?:\.com/tv|\.tv)/(?:p|embed)/(?P<id>[a-zA-Z0-9]+)(?:/(?P<display_id>[^?#/]+))?'
_VALID_URL = r'https?://(?:www\.)?9gag\.com/gag/(?P<id>[^/?&#]+)'
_TESTS = [{
'url': 'http://9gag.com/tv/p/Kk2X5/people-are-awesome-2013-is-absolutely-awesome',
_TEST = {
'url': 'https://9gag.com/gag/ae5Ag7B',
'info_dict': {
'id': 'kXzwOKyGlSA',
'id': 'ae5Ag7B',
'ext': 'mp4',
'description': 'This 3-minute video will make you smile and then make you feel untalented and insignificant. Anyway, you should share this awesomeness. (Thanks, Dino!)',
'title': '\"People Are Awesome 2013\" Is Absolutely Awesome',
'uploader_id': 'UCdEH6EjDKwtTe-sO2f0_1XA',
'uploader': 'CompilationChannel',
'upload_date': '20131110',
'view_count': int,
},
'add_ie': ['Youtube'],
}, {
'url': 'http://9gag.com/tv/p/aKolP3',
'info_dict': {
'id': 'aKolP3',
'ext': 'mp4',
'title': 'This Guy Travelled 11 countries In 44 days Just To Make This Amazing Video',
'description': "I just saw more in 1 minute than I've seen in 1 year. This guy's video is epic!!",
'uploader_id': 'rickmereki',
'uploader': 'Rick Mereki',
'upload_date': '20110803',
'view_count': int,
},
'add_ie': ['Vimeo'],
}, {
'url': 'http://9gag.com/tv/p/KklwM',
'only_matching': True,
}, {
'url': 'http://9gag.tv/p/Kk2X5',
'only_matching': True,
}, {
'url': 'http://9gag.com/tv/embed/a5Dmvl',
'only_matching': True,
}]
_EXTERNAL_VIDEO_PROVIDER = {
'1': {
'url': '%s',
'ie_key': 'Youtube',
},
'2': {
'url': 'http://player.vimeo.com/video/%s',
'ie_key': 'Vimeo',
},
'3': {
'url': 'http://instagram.com/p/%s',
'ie_key': 'Instagram',
},
'4': {
'url': 'http://vine.co/v/%s',
'ie_key': 'Vine',
},
'title': 'Capybara Agility Training',
'upload_date': '20191108',
'timestamp': 1573237208,
'categories': ['Awesome'],
'tags': ['Weimaraner', 'American Pit Bull Terrier'],
'duration': 44,
'like_count': int,
'dislike_count': int,
'comment_count': int,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
display_id = mobj.group('display_id') or video_id
post_id = self._match_id(url)
post = self._download_json(
'https://9gag.com/v1/post', post_id, query={
'id': post_id
})['data']['post']
webpage = self._download_webpage(url, display_id)
if post.get('type') != 'Animated':
raise ExtractorError(
'The given url does not contain a video',
expected=True)
post_view = self._parse_json(
self._search_regex(
r'var\s+postView\s*=\s*new\s+app\.PostView\({\s*post:\s*({.+?})\s*,\s*posts:\s*prefetchedCurrentPost',
webpage, 'post view'),
display_id)
title = post['title']
ie_key = None
source_url = post_view.get('sourceUrl')
if not source_url:
external_video_id = post_view['videoExternalId']
external_video_provider = post_view['videoExternalProvider']
source_url = self._EXTERNAL_VIDEO_PROVIDER[external_video_provider]['url'] % external_video_id
ie_key = self._EXTERNAL_VIDEO_PROVIDER[external_video_provider]['ie_key']
title = post_view['title']
description = post_view.get('description')
view_count = str_to_int(post_view.get('externalView'))
thumbnail = post_view.get('thumbnail_700w') or post_view.get('ogImageUrl') or post_view.get('thumbnail_300w')
duration = None
formats = []
thumbnails = []
for key, image in (post.get('images') or {}).items():
image_url = url_or_none(image.get('url'))
if not image_url:
continue
ext = determine_ext(image_url)
image_id = key.strip('image')
common = {
'url': image_url,
'width': int_or_none(image.get('width')),
'height': int_or_none(image.get('height')),
}
if ext in ('jpg', 'png'):
webp_url = image.get('webpUrl')
if webp_url:
t = common.copy()
t.update({
'id': image_id + '-webp',
'url': webp_url,
})
thumbnails.append(t)
common.update({
'id': image_id,
'ext': ext,
})
thumbnails.append(common)
elif ext in ('webm', 'mp4'):
if not duration:
duration = int_or_none(image.get('duration'))
common['acodec'] = 'none' if image.get('hasAudio') == 0 else None
for vcodec in ('vp8', 'vp9', 'h265'):
c_url = image.get(vcodec + 'Url')
if not c_url:
continue
c_f = common.copy()
c_f.update({
'format_id': image_id + '-' + vcodec,
'url': c_url,
'vcodec': vcodec,
})
formats.append(c_f)
common.update({
'ext': ext,
'format_id': image_id,
})
formats.append(common)
self._sort_formats(formats)
section = try_get(post, lambda x: x['postSection']['name'])
tags = None
post_tags = post.get('tags')
if post_tags:
tags = []
for tag in post_tags:
tag_key = tag.get('key')
if not tag_key:
continue
tags.append(tag_key)
get_count = lambda x: int_or_none(post.get(x + 'Count'))
return {
'_type': 'url_transparent',
'url': source_url,
'ie_key': ie_key,
'id': video_id,
'display_id': display_id,
'id': post_id,
'title': title,
'description': description,
'view_count': view_count,
'thumbnail': thumbnail,
'timestamp': int_or_none(post.get('creationTs')),
'duration': duration,
'formats': formats,
'thumbnails': thumbnails,
'like_count': get_count('upVote'),
'dislike_count': get_count('downVote'),
'comment_count': get_count('comments'),
'age_limit': 18 if post.get('nsfw') == 1 else None,
'categories': [section] if section else None,
'tags': tags,
}
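
Review note on `image_id = key.strip('image')` in the hunk above: `str.strip` removes any of the given characters from both ends rather than a literal prefix. The sketch below compares the two behaviours. The key names are hypothetical examples, not taken from the 9GAG API, and `strip_prefix` is an illustrative helper that is not part of the diff.

```python
def strip_prefix(key, prefix='image'):
    # Remove the prefix only when it is actually present at the start.
    return key[len(prefix):] if key.startswith(prefix) else key


# Hypothetical image keys, chosen only to show where the two approaches differ.
for key in ('image700', 'image460sv', 'imageGame'):
    print(key, repr(key.strip('image')), repr(strip_prefix(key)))
```

For keys whose remainder neither starts nor ends with one of the letters `i`, `m`, `a`, `g`, `e`, both approaches give the same result, which may be why the difference does not show up for the keys the extractor handles.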

@@ -6,30 +6,40 @@ import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
extract_attributes,
get_element_by_class,
urlencode_postdata,
)
class NJPWWorldIE(InfoExtractor):
_VALID_URL = r'https?://njpwworld\.com/p/(?P<id>[a-z0-9_]+)'
_VALID_URL = r'https?://(front\.)?njpwworld\.com/p/(?P<id>[a-z0-9_]+)'
IE_DESC = '新日本プロレスワールド'
_NETRC_MACHINE = 'njpwworld'
_TEST = {
_TESTS = [{
'url': 'http://njpwworld.com/p/s_series_00155_1_9/',
'info_dict': {
'id': 's_series_00155_1_9',
'ext': 'mp4',
'title': '第9試合 ランディ・サベージ vs リック・スタイナー',
'title': '闘強導夢2000 2000年1月4日 東京ドーム 第9試合 ランディ・サベージ VS リック・スタイナー',
'tags': list,
},
'params': {
'skip_download': True, # AES-encrypted m3u8
},
'skip': 'Requires login',
}
}, {
'url': 'https://front.njpwworld.com/p/s_series_00563_16_bs',
'info_dict': {
'id': 's_series_00563_16_bs',
'ext': 'mp4',
'title': 'WORLD TAG LEAGUE 2020 & BEST OF THE SUPER Jr.27 2020年12月6日 福岡・福岡国際センター バックステージコメント(字幕あり)',
'tags': ["福岡・福岡国際センター", "バックステージコメント", "2020", "20年代"],
},
'params': {
'skip_download': True,
},
}]
_LOGIN_URL = 'https://front.njpwworld.com/auth/login'
@@ -64,35 +74,27 @@ class NJPWWorldIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
formats = []
for mobj in re.finditer(r'<a[^>]+\bhref=(["\'])/player.+?[^>]*>', webpage):
player = extract_attributes(mobj.group(0))
player_path = player.get('href')
if not player_path:
continue
kind = self._search_regex(
r'(low|high)$', player.get('class') or '', 'kind',
default='low')
for kind, vid in re.findall(r'if\s+\(\s*imageQualityType\s*==\s*\'([^\']+)\'\s*\)\s*{\s*video_id\s*=\s*"(\d+)"', webpage):
player_path = '/intent?id=%s&type=url' % vid
player_url = compat_urlparse.urljoin(url, player_path)
player_page = self._download_webpage(
player_url, video_id, note='Downloading player page')
entries = self._parse_html5_media_entries(
player_url, player_page, video_id, m3u8_id='hls-%s' % kind,
m3u8_entry_protocol='m3u8_native')
kind_formats = entries[0]['formats']
for f in kind_formats:
f['quality'] = 2 if kind == 'high' else 1
formats.extend(kind_formats)
formats.append({
'url': player_url,
'format_id': kind,
'ext': 'mp4',
'protocol': 'm3u8',
'quality': 2 if kind == 'high' else 1,
})
self._sort_formats(formats)
post_content = get_element_by_class('post-content', webpage)
tag_block = get_element_by_class('tag-block', webpage)
tags = re.findall(
r'<li[^>]+class="tag-[^"]+"><a[^>]*>([^<]+)</a></li>', post_content
) if post_content else None
r'<a[^>]+class="tag-[^"]+"[^>]*>([^<]+)</a>', tag_block
) if tag_block else None
return {
'id': video_id,
'title': self._og_search_title(webpage),
'title': get_element_by_class('article-title', webpage) or self._og_search_title(webpage),
'formats': formats,
'tags': tags,
}
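
The new format discovery in this hunk can be sketched in isolation as follows. The regular expression and the `/intent?id=%s&type=url` path are copied from the diff. The sample page script and its video ids are made up for illustration, and `urllib.parse.urljoin` stands in for `compat_urlparse.urljoin`.

```python
import re
from urllib.parse import urljoin

# Pattern copied from the diff above.
QUALITY_RE = r'if\s+\(\s*imageQualityType\s*==\s*\'([^\']+)\'\s*\)\s*{\s*video_id\s*=\s*"(\d+)"'

# Hypothetical page script; the real markup may differ.
SAMPLE_PAGE = '''
if (imageQualityType == 'high') { video_id = "12345"; }
else if (imageQualityType == 'low') { video_id = "12346"; }
'''


def build_formats(webpage, page_url):
    # One format entry per (quality kind, numeric video id) pair found in the page.
    formats = []
    for kind, vid in re.findall(QUALITY_RE, webpage):
        formats.append({
            'url': urljoin(page_url, '/intent?id=%s&type=url' % vid),
            'format_id': kind,
            'quality': 2 if kind == 'high' else 1,
        })
    return formats


formats = build_formats(
    SAMPLE_PAGE, 'https://front.njpwworld.com/p/s_series_00563_16_bs')
```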

@@ -22,11 +22,15 @@ from ..utils import (
orderedSet,
remove_quotes,
str_to_int,
update_url_query,
urlencode_postdata,
url_or_none,
)
class PornHubBaseIE(InfoExtractor):
_NETRC_MACHINE = 'pornhub'
def _download_webpage_handle(self, *args, **kwargs):
def dl(*args, **kwargs):
return super(PornHubBaseIE, self)._download_webpage_handle(*args, **kwargs)
@@ -52,6 +56,66 @@ class PornHubBaseIE(InfoExtractor):
return webpage, urlh
def _real_initialize(self):
self._logged_in = False
def _login(self, host):
if self._logged_in:
return
site = host.split('.')[0]
# Both sites pornhub and pornhubpremium have separate accounts
# so there should be an option to provide credentials for both.
# At the same time some videos are available under the same video id
# on both sites so that we have to identify them as the same video.
# For that purpose we have to keep both in the same extractor
# but under different netrc machines.
username, password = self._get_login_info(netrc_machine=site)
if username is None:
return
login_url = 'https://www.%s/%slogin' % (host, 'premium/' if 'premium' in host else '')
login_page = self._download_webpage(
login_url, None, 'Downloading %s login page' % site)
def is_logged(webpage):
return any(re.search(p, webpage) for p in (
r'class=["\']signOut',
r'>Sign\s+[Oo]ut\s*<'))
if is_logged(login_page):
self._logged_in = True
return
login_form = self._hidden_inputs(login_page)
login_form.update({
'username': username,
'password': password,
})
response = self._download_json(
'https://www.%s/front/authenticate' % host, None,
'Logging in to %s' % site,
data=urlencode_postdata(login_form),
headers={
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Referer': login_url,
'X-Requested-With': 'XMLHttpRequest',
})
if response.get('success') == '1':
self._logged_in = True
return
message = response.get('message')
if message is not None:
raise ExtractorError(
'Unable to login: %s' % message, expected=True)
raise ExtractorError('Unable to log in')
class PornHubIE(PornHubBaseIE):
IE_DESC = 'PornHub and Thumbzilla'
@@ -163,12 +227,20 @@ class PornHubIE(PornHubBaseIE):
}, {
'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5e4acdae54a82',
'only_matching': True,
}, {
# Some videos are available with the same id on both premium
# and non-premium sites (e.g. this and the following test)
'url': 'https://www.pornhub.com/view_video.php?viewkey=ph5f75b0f4b18e3',
'only_matching': True,
}, {
'url': 'https://www.pornhubpremium.com/view_video.php?viewkey=ph5f75b0f4b18e3',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return re.findall(
r'<iframe[^>]+?src=["\'](?P<url>(?:https?:)?//(?:www\.)?pornhub\.(?:com|net|org)/embed/[\da-z]+)',
r'<iframe[^>]+?src=["\'](?P<url>(?:https?:)?//(?:www\.)?pornhub(?:premium)?\.(?:com|net|org)/embed/[\da-z]+)',
webpage)
def _extract_count(self, pattern, webpage, name):
@@ -180,12 +252,7 @@ class PornHubIE(PornHubBaseIE):
host = mobj.group('host') or 'pornhub.com'
video_id = mobj.group('id')
if 'premium' in host:
if not self._downloader.params.get('cookiefile'):
raise ExtractorError(
'PornHub Premium requires authentication.'
' You may want to use --cookies.',
expected=True)
self._login(host)
self._set_cookie(host, 'age_verified', '1')
@@ -405,6 +472,10 @@ class PornHubIE(PornHubBaseIE):
class PornHubPlaylistBaseIE(PornHubBaseIE):
def _extract_page(self, url):
return int_or_none(self._search_regex(
r'\bpage=(\d+)', url, 'page', default=None))
def _extract_entries(self, webpage, host):
# Only process container div with main playlist content skipping
# drop-down menu that uses similar pattern for videos (see
@@ -422,26 +493,6 @@ class PornHubPlaylistBaseIE(PornHubBaseIE):
container))
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
playlist_id = mobj.group('id')
webpage = self._download_webpage(url, playlist_id)
entries = self._extract_entries(webpage, host)
playlist = self._parse_json(
self._search_regex(
r'(?:playlistObject|PLAYLIST_VIEW)\s*=\s*({.+?});', webpage,
'playlist', default='{}'),
playlist_id, fatal=False)
title = playlist.get('title') or self._search_regex(
r'>Videos\s+in\s+(.+?)\s+[Pp]laylist<', webpage, 'title', fatal=False)
return self.playlist_result(
entries, playlist_id, title, playlist.get('description'))
class PornHubUserIE(PornHubPlaylistBaseIE):
_VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub(?:premium)?\.(?:com|net|org))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/?#&]+))(?:[?#&]|/(?!videos)|$)'
@@ -463,14 +514,27 @@ class PornHubUserIE(PornHubPlaylistBaseIE):
}, {
'url': 'https://www.pornhub.com/model/zoe_ph?abc=1',
'only_matching': True,
}, {
# Unavailable via /videos page, but available with direct pagination
# on pornstar page (see [1]), requires premium
# 1. https://github.com/ytdl-org/youtube-dl/issues/27853
'url': 'https://www.pornhubpremium.com/pornstar/sienna-west',
'only_matching': True,
}, {
# Same as before, multi page
'url': 'https://www.pornhubpremium.com/pornstar/lily-labeau',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
user_id = mobj.group('id')
videos_url = '%s/videos' % mobj.group('url')
page = self._extract_page(url)
if page:
videos_url = update_url_query(videos_url, {'page': page})
return self.url_result(
'%s/videos' % mobj.group('url'), ie=PornHubPagedVideoListIE.ie_key(),
video_id=user_id)
videos_url, ie=PornHubPagedVideoListIE.ie_key(), video_id=user_id)
class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
@@ -483,32 +547,55 @@ class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
<button[^>]+\bid=["\']moreDataBtn
''', webpage) is not None
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
item_id = mobj.group('id')
def _entries(self, url, host, item_id):
page = self._extract_page(url)
page = int_or_none(self._search_regex(
r'\bpage=(\d+)', url, 'page', default=None))
VIDEOS = '/videos'
entries = []
for page_num in (page, ) if page is not None else itertools.count(1):
def download_page(base_url, num, fallback=False):
note = 'Downloading page %d%s' % (num, ' (switch to fallback)' if fallback else '')
return self._download_webpage(
base_url, item_id, note, query={'page': num})
def is_404(e):
return isinstance(e.cause, compat_HTTPError) and e.cause.code == 404
base_url = url
has_page = page is not None
first_page = page if has_page else 1
for page_num in (first_page, ) if has_page else itertools.count(first_page):
try:
webpage = self._download_webpage(
url, item_id, 'Downloading page %d' % page_num,
query={'page': page_num})
try:
webpage = download_page(base_url, page_num)
except ExtractorError as e:
# Some sources may not be available via /videos page,
# trying to fallback to main page pagination (see [1])
# 1. https://github.com/ytdl-org/youtube-dl/issues/27853
if is_404(e) and page_num == first_page and VIDEOS in base_url:
base_url = base_url.replace(VIDEOS, '')
webpage = download_page(base_url, page_num, fallback=True)
else:
raise
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
if is_404(e) and page_num != first_page:
break
raise
page_entries = self._extract_entries(webpage, host)
if not page_entries:
break
entries.extend(page_entries)
for e in page_entries:
yield e
if not self._has_more(webpage):
break
return self.playlist_result(orderedSet(entries), item_id)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
item_id = mobj.group('id')
self._login(host)
return self.playlist_result(self._entries(url, host, item_id), item_id)
class PornHubPagedVideoListIE(PornHubPagedPlaylistBaseIE):
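
The page selection used by `_extract_page` and `_entries` above can be sketched with the standard library alone. The regular expression is the one from the diff. The function names and the example URLs are illustrative assumptions, and the real code additionally handles 404 responses and the `/videos` fallback.

```python
import itertools
import re


def extract_page(url):
    # Return the explicit ?page=N value, or None when the URL has none.
    m = re.search(r'\bpage=(\d+)', url)
    return int(m.group(1)) if m else None


def page_numbers(url):
    # A URL with an explicit page yields only that page;
    # otherwise pages are counted upwards from 1 until the caller stops.
    page = extract_page(url)
    has_page = page is not None
    first_page = page if has_page else 1
    return (first_page,) if has_page else itertools.count(first_page)
```

With an explicit page the iterable has a single element, so a request for `?page=3` downloads only that page rather than everything from page 3 onwards.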

@@ -20,19 +20,6 @@ class BellatorIE(MTVServicesInfoExtractor):
_FEED_URL = 'http://www.bellator.com/feeds/mrss/'
_GEO_COUNTRIES = ['US']
def _extract_mgid(self, webpage, url):
mgid = None
if not mgid:
mgid = self._extract_triforce_mgid(webpage)
if not mgid:
mgid = self._extract_new_triforce_mgid(webpage, url)
return mgid
# TODO Remove - Reason: Outdated Site
class ParamountNetworkIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?paramountnetwork\.com/[^/]+/[\da-z]{6}(?:[/?#&]|$)'
@@ -56,16 +43,6 @@ class ParamountNetworkIE(MTVServicesInfoExtractor):
def _get_feed_query(self, uri):
return {
'arcEp': 'paramountnetwork.com',
'imageEp': 'paramountnetwork.com',
'mgid': uri,
}
def _extract_mgid(self, webpage, url):
root_data = self._parse_json(self._search_regex(
r'window\.__DATA__\s*=\s*({.+})',
webpage, 'data'), None)
def find_sub_data(data, data_type):
return next(c for c in data['children'] if c.get('type') == data_type)
c = find_sub_data(find_sub_data(root_data, 'MainContainer'), 'VideoPlayer')
return c['props']['media']['video']['config']['uri']

@@ -0,0 +1,156 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import (
clean_podcast_url,
float_or_none,
int_or_none,
strip_or_none,
try_get,
unified_strdate,
)
class SpotifyBaseIE(InfoExtractor):
_ACCESS_TOKEN = None
_OPERATION_HASHES = {
'Episode': '8276d4423d709ae9b68ec1b74cc047ba0f7479059a37820be730f125189ac2bf',
'MinimalShow': '13ee079672fad3f858ea45a55eb109553b4fb0969ed793185b2e34cbb6ee7cc0',
'ShowEpisodes': 'e0e5ce27bd7748d2c59b4d44ba245a8992a05be75d6fabc3b20753fc8857444d',
}
_VALID_URL_TEMPL = r'https?://open\.spotify\.com/%s/(?P<id>[^/?&#]+)'
def _real_initialize(self):
self._ACCESS_TOKEN = self._download_json(
'https://open.spotify.com/get_access_token', None)['accessToken']
def _call_api(self, operation, video_id, variables):
return self._download_json(
'https://api-partner.spotify.com/pathfinder/v1/query', video_id, query={
'operationName': 'query' + operation,
'variables': json.dumps(variables),
'extensions': json.dumps({
'persistedQuery': {
'sha256Hash': self._OPERATION_HASHES[operation],
},
})
}, headers={'authorization': 'Bearer ' + self._ACCESS_TOKEN})['data']
def _extract_episode(self, episode, series):
episode_id = episode['id']
title = episode['name'].strip()
formats = []
audio_preview = episode.get('audioPreview') or {}
audio_preview_url = audio_preview.get('url')
if audio_preview_url:
f = {
'url': audio_preview_url.replace('://p.scdn.co/mp3-preview/', '://anon-podcast.scdn.co/'),
'vcodec': 'none',
}
audio_preview_format = audio_preview.get('format')
if audio_preview_format:
f['format_id'] = audio_preview_format
mobj = re.match(r'([0-9A-Z]{3})_(?:[A-Z]+_)?(\d+)', audio_preview_format)
if mobj:
f.update({
'abr': int(mobj.group(2)),
'ext': mobj.group(1).lower(),
})
formats.append(f)
for item in (try_get(episode, lambda x: x['audio']['items']) or []):
item_url = item.get('url')
if not (item_url and item.get('externallyHosted')):
continue
formats.append({
'url': clean_podcast_url(item_url),
'vcodec': 'none',
})
thumbnails = []
for source in (try_get(episode, lambda x: x['coverArt']['sources']) or []):
source_url = source.get('url')
if not source_url:
continue
thumbnails.append({
'url': source_url,
'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
})
return {
'id': episode_id,
'title': title,
'formats': formats,
'thumbnails': thumbnails,
'description': strip_or_none(episode.get('description')),
'duration': float_or_none(try_get(
episode, lambda x: x['duration']['totalMilliseconds']), 1000),
'release_date': unified_strdate(try_get(
episode, lambda x: x['releaseDate']['isoString'])),
'series': series,
}
class SpotifyIE(SpotifyBaseIE):
IE_NAME = 'spotify'
_VALID_URL = SpotifyBaseIE._VALID_URL_TEMPL % 'episode'
_TEST = {
'url': 'https://open.spotify.com/episode/4Z7GAJ50bgctf6uclHlWKo',
'md5': '74010a1e3fa4d9e1ab3aa7ad14e42d3b',
'info_dict': {
'id': '4Z7GAJ50bgctf6uclHlWKo',
'ext': 'mp3',
'title': 'From the archive: Why time management is ruining our lives',
'description': 'md5:b120d9c4ff4135b42aa9b6d9cde86935',
'duration': 2083.605,
'release_date': '20201217',
'series': "The Guardian's Audio Long Reads",
}
}
def _real_extract(self, url):
episode_id = self._match_id(url)
episode = self._call_api('Episode', episode_id, {
'uri': 'spotify:episode:' + episode_id
})['episode']
return self._extract_episode(
episode, try_get(episode, lambda x: x['podcast']['name']))
class SpotifyShowIE(SpotifyBaseIE):
IE_NAME = 'spotify:show'
_VALID_URL = SpotifyBaseIE._VALID_URL_TEMPL % 'show'
_TEST = {
'url': 'https://open.spotify.com/show/4PM9Ke6l66IRNpottHKV9M',
'info_dict': {
'id': '4PM9Ke6l66IRNpottHKV9M',
'title': 'The Story from the Guardian',
'description': 'The Story podcast is dedicated to our finest audio documentaries, investigations and long form stories',
},
'playlist_mincount': 36,
}
def _real_extract(self, url):
show_id = self._match_id(url)
podcast = self._call_api('ShowEpisodes', show_id, {
'limit': 1000000000,
'offset': 0,
'uri': 'spotify:show:' + show_id,
})['podcast']
podcast_name = podcast.get('name')
entries = []
for item in (try_get(podcast, lambda x: x['episodes']['items']) or []):
episode = item.get('episode')
if not episode:
continue
entries.append(self._extract_episode(episode, podcast_name))
return self.playlist_result(
entries, show_id, podcast_name, podcast.get('description'))
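
The format-string parsing in `_extract_episode` above can be tried on its own. The regular expression is copied from the diff. The sample format names such as `MP3_96` are assumptions about what the API returns, and `parse_preview_format` is an illustrative helper name.

```python
import re


def parse_preview_format(audio_preview_format):
    # Mirror the diff: the format string becomes the format_id, and a
    # three-character codec prefix plus a trailing number give ext and abr.
    f = {'format_id': audio_preview_format}
    mobj = re.match(r'([0-9A-Z]{3})_(?:[A-Z]+_)?(\d+)', audio_preview_format)
    if mobj:
        f.update({
            'abr': int(mobj.group(2)),
            'ext': mobj.group(1).lower(),
        })
    return f


# Assumed sample values; the last one shows the no-match case.
for name in ('MP3_96', 'OGG_VORBIS_160', 'unknown'):
    print(parse_preview_format(name))
```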

@@ -255,8 +255,10 @@ class SVTPlayIE(SVTPlayBaseIE):
svt_id = self._search_regex(
(r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)',
r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
r'["\']videoSvtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)',
r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"',
r'["\']svtId["\']\s*:\s*["\']([\da-zA-Z-]+)'),
r'["\']svtId["\']\s*:\s*["\']([\da-zA-Z-]+)',
r'["\']svtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)'),
webpage, 'video id')
info_dict = self._extract_by_video_id(svt_id, webpage)
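
The added patterns in this hunk tolerate a backslash before each quote, which matters when the JSON is embedded as an escaped string. A minimal check, using the old and new `videoSvtId` patterns from the diff and a made-up id value:

```python
import re

OLD = r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)'
NEW = r'["\']videoSvtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)'

# Hypothetical page fragments: plain JSON and JSON escaped inside a string.
plain = '{"videoSvtId":"jXvk42E"}'
escaped = '{\\"videoSvtId\\":\\"jXvk42E\\"}'

old_on_escaped = re.search(OLD, escaped)
new_on_escaped = re.search(NEW, escaped)
new_on_plain = re.search(NEW, plain)
```

Since the backslashes are optional, the new pattern still covers the unescaped form that the old pattern matched.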

@@ -0,0 +1,193 @@
# coding: utf-8
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
try_get,
)
class TrovoBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?trovo\.live/'
def _extract_streamer_info(self, data):
streamer_info = data.get('streamerInfo') or {}
username = streamer_info.get('userName')
return {
'uploader': streamer_info.get('nickName'),
'uploader_id': str_or_none(streamer_info.get('uid')),
'uploader_url': 'https://trovo.live/' + username if username else None,
}
class TrovoIE(TrovoBaseIE):
_VALID_URL = TrovoBaseIE._VALID_URL_BASE + r'(?!(?:clip|video)/)(?P<id>[^/?&#]+)'
def _real_extract(self, url):
username = self._match_id(url)
live_info = self._download_json(
'https://gql.trovo.live/', username, query={
'query': '''{
getLiveInfo(params: {userName: "%s"}) {
isLive
programInfo {
coverUrl
id
streamInfo {
desc
playUrl
}
title
}
streamerInfo {
nickName
uid
userName
}
}
}''' % username,
})['data']['getLiveInfo']
if live_info.get('isLive') == 0:
raise ExtractorError('%s is offline' % username, expected=True)
program_info = live_info['programInfo']
program_id = program_info['id']
title = self._live_title(program_info['title'])
formats = []
for stream_info in (program_info.get('streamInfo') or []):
play_url = stream_info.get('playUrl')
if not play_url:
continue
format_id = stream_info.get('desc')
formats.append({
'format_id': format_id,
'height': int_or_none(format_id[:-1]) if format_id else None,
'url': play_url,
})
self._sort_formats(formats)
info = {
'id': program_id,
'title': title,
'formats': formats,
'thumbnail': program_info.get('coverUrl'),
'is_live': True,
}
info.update(self._extract_streamer_info(live_info))
return info
class TrovoVodIE(TrovoBaseIE):
_VALID_URL = TrovoBaseIE._VALID_URL_BASE + r'(?:clip|video)/(?P<id>[^/?&#]+)'
_TESTS = [{
'url': 'https://trovo.live/video/ltv-100095501_100095501_1609596043',
'info_dict': {
'id': 'ltv-100095501_100095501_1609596043',
'ext': 'mp4',
'title': 'Spontaner 12 Stunden Stream! - Ok Boomer!',
'uploader': 'Exsl',
'timestamp': 1609640305,
'upload_date': '20210103',
'uploader_id': '100095501',
'duration': 43977,
'view_count': int,
'like_count': int,
'comment_count': int,
'comments': 'mincount:8',
'categories': ['Grand Theft Auto V'],
},
}, {
'url': 'https://trovo.live/clip/lc-5285890810184026005',
'only_matching': True,
}]
def _real_extract(self, url):
vid = self._match_id(url)
resp = self._download_json(
'https://gql.trovo.live/', vid, data=json.dumps([{
'query': '''{
batchGetVodDetailInfo(params: {vids: ["%s"]}) {
VodDetailInfos
}
}''' % vid,
}, {
'query': '''{
getCommentList(params: {appInfo: {postID: "%s"}, pageSize: 1000000000, preview: {}}) {
commentList {
author {
nickName
uid
}
commentID
content
createdAt
parentID
}
}
}''' % vid,
}]).encode(), headers={
'Content-Type': 'application/json',
})
vod_detail_info = resp[0]['data']['batchGetVodDetailInfo']['VodDetailInfos'][vid]
vod_info = vod_detail_info['vodInfo']
title = vod_info['title']
language = vod_info.get('languageName')
formats = []
for play_info in (vod_info.get('playInfos') or []):
play_url = play_info.get('playUrl')
if not play_url:
continue
format_id = play_info.get('desc')
formats.append({
'ext': 'mp4',
'filesize': int_or_none(play_info.get('fileSize')),
'format_id': format_id,
'height': int_or_none(format_id[:-1]) if format_id else None,
'language': language,
'protocol': 'm3u8_native',
'tbr': int_or_none(play_info.get('bitrate')),
'url': play_url,
})
self._sort_formats(formats)
category = vod_info.get('categoryName')
get_count = lambda x: int_or_none(vod_info.get(x + 'Num'))
comment_list = try_get(resp, lambda x: x[1]['data']['getCommentList']['commentList'], list) or []
comments = []
for comment in comment_list:
content = comment.get('content')
if not content:
continue
author = comment.get('author') or {}
parent = comment.get('parentID')
comments.append({
'author': author.get('nickName'),
'author_id': str_or_none(author.get('uid')),
'id': str_or_none(comment.get('commentID')),
'text': content,
'timestamp': int_or_none(comment.get('createdAt')),
'parent': 'root' if parent == 0 else str_or_none(parent),
})
info = {
'id': vid,
'title': title,
'formats': formats,
'thumbnail': vod_info.get('coverUrl'),
'timestamp': int_or_none(vod_info.get('publishTs')),
'duration': int_or_none(vod_info.get('duration')),
'view_count': get_count('watch'),
'like_count': get_count('like'),
'comment_count': get_count('comment'),
'comments': comments,
'categories': [category] if category else None,
}
info.update(self._extract_streamer_info(vod_detail_info))
return info
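
Both Trovo extractors derive `height` from the `desc` label by dropping its last character. A rough standalone sketch follows, assuming labels such as `720p`. The helper is a simplified stand-in for `int_or_none(format_id[:-1])`, and the sample labels are not taken from the API.

```python
def height_from_desc(desc):
    # Simplified stand-in for int_or_none(format_id[:-1]) from the diff:
    # strip the trailing unit character and try to read the rest as an integer.
    if not desc:
        return None
    try:
        return int(desc[:-1])
    except ValueError:
        return None


# Assumed labels; a non-numeric label simply yields no height.
for label in ('720p', '1080p', 'source', None):
    print(label, height_from_desc(label))
```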

@@ -20,7 +20,7 @@ from ..utils import (
class TV2IE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?tv2\.no/v/(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'http://www.tv2.no/v/916509/',
'info_dict': {
'id': '916509',
@@ -33,7 +33,7 @@ class TV2IE(InfoExtractor):
'view_count': int,
'categories': list,
},
}
}]
_API_DOMAIN = 'sumo.tv2.no'
_PROTOCOLS = ('HDS', 'HLS', 'DASH')
_GEO_COUNTRIES = ['NO']
@@ -42,6 +42,12 @@ class TV2IE(InfoExtractor):
video_id = self._match_id(url)
api_base = 'http://%s/api/web/asset/%s' % (self._API_DOMAIN, video_id)
asset = self._download_json(
api_base + '.json', video_id,
'Downloading metadata JSON')['asset']
title = asset.get('subtitle') or asset['title']
is_live = asset.get('live') is True
formats = []
format_urls = []
for protocol in self._PROTOCOLS:
@@ -81,7 +87,8 @@ class TV2IE(InfoExtractor):
elif ext == 'm3u8':
if not data.get('drmProtected'):
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
video_url, video_id, 'mp4',
'm3u8' if is_live else 'm3u8_native',
m3u8_id=format_id, fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
@@ -99,11 +106,6 @@ class TV2IE(InfoExtractor):
raise ExtractorError('This video is DRM protected.', expected=True)
self._sort_formats(formats)
asset = self._download_json(
api_base + '.json', video_id,
'Downloading metadata JSON')['asset']
title = asset['title']
thumbnails = [{
'id': thumbnail.get('@type'),
'url': thumbnail.get('url'),
@@ -112,7 +114,7 @@ class TV2IE(InfoExtractor):
return {
'id': video_id,
'url': video_url,
'title': title,
'title': self._live_title(title) if is_live else title,
'description': strip_or_none(asset.get('description')),
'thumbnails': thumbnails,
'timestamp': parse_iso8601(asset.get('createTime')),
@@ -120,6 +122,7 @@ class TV2IE(InfoExtractor):
'view_count': int_or_none(asset.get('views')),
'categories': asset.get('keywords', '').split(','),
'formats': formats,
'is_live': is_live,
}
@@ -168,13 +171,13 @@ class TV2ArticleIE(InfoExtractor):
class KatsomoIE(TV2IE):
_VALID_URL = r'https?://(?:www\.)?(?:katsomo|mtv)\.fi/(?:#!/)?(?:[^/]+/[0-9a-z-]+-\d+/[0-9a-z-]+-|[^/]+/\d+/[^/]+/)(?P<id>\d+)'
_TEST = {
_VALID_URL = r'https?://(?:www\.)?(?:katsomo|mtv(uutiset)?)\.fi/(?:sarja/[0-9a-z-]+-\d+/[0-9a-z-]+-|(?:#!/)?jakso/(?:\d+/[^/]+/)?|video/prog)(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.mtv.fi/sarja/mtv-uutiset-live-33001002003/lahden-pelicans-teki-kovan-ratkaisun-ville-nieminen-pihalle-1181321',
'info_dict': {
'id': '1181321',
'ext': 'mp4',
'title': 'MTV Uutiset Live',
'title': 'Lahden Pelicans teki kovan ratkaisun Ville Nieminen pihalle',
'description': 'Päätöksen teki Pelicansin hallitus.',
'timestamp': 1575116484,
'upload_date': '20191130',
@@ -186,7 +189,60 @@ class KatsomoIE(TV2IE):
# m3u8 download
'skip_download': True,
},
}
}, {
'url': 'http://www.katsomo.fi/#!/jakso/33001005/studio55-fi/658521/jukka-kuoppamaki-tekee-yha-lauluja-vaikka-lentokoneessa',
'only_matching': True,
}, {
'url': 'https://www.mtvuutiset.fi/video/prog1311159',
'only_matching': True,
}, {
'url': 'https://www.katsomo.fi/#!/jakso/1311159',
'only_matching': True,
}]
_API_DOMAIN = 'api.katsomo.fi'
_PROTOCOLS = ('HLS', 'MPD')
_GEO_COUNTRIES = ['FI']
class MTVUutisetArticleIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)mtvuutiset\.fi/artikkeli/[^/]+/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.mtvuutiset.fi/artikkeli/tallaisia-vaurioita-viking-amorellassa-on-useamman-osaston-alla-vetta/7931384',
'info_dict': {
'id': '1311159',
'ext': 'mp4',
'title': 'Viking Amorellan matkustajien evakuointi on alkanut tältä operaatio näyttää laivalla',
'description': 'Viking Amorellan matkustajien evakuointi on alkanut tältä operaatio näyttää laivalla',
'timestamp': 1600608966,
'upload_date': '20200920',
'duration': 153.7886666,
'view_count': int,
'categories': list,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
# multiple Youtube embeds
'url': 'https://www.mtvuutiset.fi/artikkeli/50-vuotta-subarun-vastaiskua/6070962',
'only_matching': True,
}]
def _real_extract(self, url):
article_id = self._match_id(url)
article = self._download_json(
'http://api.mtvuutiset.fi/mtvuutiset/api/json/' + article_id,
article_id)
def entries():
for video in (article.get('videos') or []):
video_type = video.get('videotype')
video_url = video.get('url')
if not (video_url and video_type in ('katsomo', 'youtube')):
continue
yield self.url_result(
video_url, video_type.capitalize(), video.get('video_id'))
return self.playlist_result(
entries(), article_id, article.get('title'), article.get('description'))
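
The reworked `KatsomoIE._VALID_URL` can be checked against the four test URLs listed in the hunk. The pattern and URLs below are copied from the diff; only the surrounding loop is illustrative.

```python
import re

VALID_URL = r'https?://(?:www\.)?(?:katsomo|mtv(uutiset)?)\.fi/(?:sarja/[0-9a-z-]+-\d+/[0-9a-z-]+-|(?:#!/)?jakso/(?:\d+/[^/]+/)?|video/prog)(?P<id>\d+)'

TEST_URLS = [
    'https://www.mtv.fi/sarja/mtv-uutiset-live-33001002003/lahden-pelicans-teki-kovan-ratkaisun-ville-nieminen-pihalle-1181321',
    'http://www.katsomo.fi/#!/jakso/33001005/studio55-fi/658521/jukka-kuoppamaki-tekee-yha-lauluja-vaikka-lentokoneessa',
    'https://www.mtvuutiset.fi/video/prog1311159',
    'https://www.katsomo.fi/#!/jakso/1311159',
]

# Collect the id captured for each URL, in order.
ids = [re.match(VALID_URL, u).group('id') for u in TEST_URLS]
```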

@@ -17,7 +17,7 @@ class TV4IE(InfoExtractor):
tv4\.se/(?:[^/]+)/klipp/(?:.*)-|
tv4play\.se/
(?:
(?:program|barn)/(?:[^/]+/|(?:[^\?]+)\?video_id=)|
(?:program|barn)/(?:(?:[^/]+/){1,2}|(?:[^\?]+)\?video_id=)|
iframe/video/|
film/|
sport/|
@@ -65,6 +65,10 @@ class TV4IE(InfoExtractor):
{
'url': 'http://www.tv4play.se/program/farang/3922081',
'only_matching': True,
},
{
'url': 'https://www.tv4play.se/program/nyheterna/avsnitt/13315940',
'only_matching': True,
}
]

@@ -4,7 +4,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import int_or_none
from ..utils import (
int_or_none,
parse_iso8601,
str_or_none,
strip_or_none,
try_get,
)
class VidioIE(InfoExtractor):
@@ -21,57 +27,63 @@ class VidioIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 149,
'like_count': int,
'uploader': 'TWELVE Pic',
'timestamp': 1444902800,
'upload_date': '20151015',
'uploader_id': 'twelvepictures',
'channel': 'Cover Music Video',
'channel_id': '280236',
'view_count': int,
'dislike_count': int,
'comment_count': int,
'tags': 'count:4',
},
}, {
'url': 'https://www.vidio.com/watch/77949-south-korea-test-fires-missile-that-can-strike-all-of-the-north',
'only_matching': True,
}]
def _real_initialize(self):
self._api_key = self._download_json(
'https://www.vidio.com/auth', None, data=b'')['api_key']
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id, display_id = mobj.group('id', 'display_id')
video_id, display_id = re.match(self._VALID_URL, url).groups()
data = self._download_json(
'https://api.vidio.com/videos/' + video_id, display_id, headers={
'Content-Type': 'application/vnd.api+json',
'X-API-KEY': self._api_key,
})
video = data['videos'][0]
title = video['title'].strip()
webpage = self._download_webpage(url, display_id)
title = self._og_search_title(webpage)
m3u8_url, duration, thumbnail = [None] * 3
clips = self._parse_json(
self._html_search_regex(
r'data-json-clips\s*=\s*(["\'])(?P<data>\[.+?\])\1',
webpage, 'video data', default='[]', group='data'),
display_id, fatal=False)
if clips:
clip = clips[0]
m3u8_url = clip.get('sources', [{}])[0].get('file')
duration = clip.get('clip_duration')
thumbnail = clip.get('image')
m3u8_url = m3u8_url or self._search_regex(
r'data(?:-vjs)?-clip-hls-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'hls url', group='url')
formats = self._extract_m3u8_formats(
m3u8_url, display_id, 'mp4', entry_protocol='m3u8_native')
data['clips'][0]['hls_url'], display_id, 'mp4', 'm3u8_native')
self._sort_formats(formats)
duration = int_or_none(duration or self._search_regex(
r'data-video-duration=(["\'])(?P<duration>\d+)\1', webpage,
'duration', fatal=False, group='duration'))
thumbnail = thumbnail or self._og_search_thumbnail(webpage)
like_count = int_or_none(self._search_regex(
(r'<span[^>]+data-comment-vote-count=["\'](\d+)',
r'<span[^>]+class=["\'].*?\blike(?:__|-)count\b.*?["\'][^>]*>\s*(\d+)'),
webpage, 'like count', fatal=False))
get_first = lambda x: try_get(data, lambda y: y[x + 's'][0], dict) or {}
channel = get_first('channel')
user = get_first('user')
username = user.get('username')
get_count = lambda x: int_or_none(video.get('total_' + x))
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': self._og_search_description(webpage),
'thumbnail': thumbnail,
'duration': duration,
'like_count': like_count,
'description': strip_or_none(video.get('description')),
'thumbnail': video.get('image_url_medium'),
'duration': int_or_none(video.get('duration')),
'like_count': get_count('likes'),
'formats': formats,
'uploader': user.get('name'),
'timestamp': parse_iso8601(video.get('created_at')),
'uploader_id': username,
'uploader_url': 'https://www.vidio.com/@' + username if username else None,
'channel': channel.get('name'),
'channel_id': str_or_none(channel.get('id')),
'view_count': get_count('view_count'),
'dislike_count': get_count('dislikes'),
'comment_count': get_count('comments'),
'tags': video.get('tag_list'),
}

@@ -125,7 +125,7 @@ class VLiveIE(VLiveBaseIE):
headers={'Referer': 'https://www.vlive.tv/'}, query=query)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
self.raise_login_required(json.loads(e.cause.read().decode())['message'])
self.raise_login_required(json.loads(e.cause.read().decode('utf-8'))['message'])
raise
def _real_extract(self, url):

View File

@@ -0,0 +1,62 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    int_or_none,
    parse_iso8601,
    try_get,
)


class VTMIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?vtm\.be/([^/?&#]+)~v(?P<id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})'
    _TEST = {
        'url': 'https://vtm.be/gast-vernielt-genkse-hotelkamer~ve7534523-279f-4b4d-a5c9-a33ffdbe23e1',
        'md5': '37dca85fbc3a33f2de28ceb834b071f8',
        'info_dict': {
            'id': '192445',
            'ext': 'mp4',
            'title': 'Gast vernielt Genkse hotelkamer',
            'timestamp': 1611060180,
            'upload_date': '20210119',
            'duration': 74,
            # TODO: fix url _type result processing
            # 'series': 'Op Interventie',
        }
    }

    def _real_extract(self, url):
        uuid = self._match_id(url)
        video = self._download_json(
            'https://omc4vm23offuhaxx6hekxtzspi.appsync-api.eu-west-1.amazonaws.com/graphql',
            uuid, query={
                'query': '''{
  getComponent(type: Video, uuid: "%s") {
    ... on Video {
      description
      duration
      myChannelsVideo
      program {
        title
      }
      publishedAt
      title
    }
  }
}''' % uuid,
            }, headers={
                'x-api-key': 'da2-lz2cab4tfnah3mve6wiye4n77e',
            })['data']['getComponent']

        return {
            '_type': 'url',
            'id': uuid,
            'title': video.get('title'),
            'url': 'http://mychannels.video/embed/%d' % video['myChannelsVideo'],
            'description': video.get('description'),
            'timestamp': parse_iso8601(video.get('publishedAt')),
            'duration': int_or_none(video.get('duration')),
            'series': try_get(video, lambda x: x['program']['title']),
            'ie_key': 'Medialaan',
        }
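
As a quick check of the `_VALID_URL` pattern above, the following minimal sketch applies the same regular expression to the test URL using only the standard `re` module. The helper name `extract_vtm_uuid` is hypothetical and not part of the extractor; the real class relies on `InfoExtractor._match_id` instead.

```python
import re

# Same pattern as VTMIE._VALID_URL in the diff above.
VALID_URL = (
    r'https?://(?:www\.)?vtm\.be/([^/?&#]+)'
    r'~v(?P<id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})'
)


def extract_vtm_uuid(url):
    """Hypothetical helper: return the UUID captured by the `id` group, or None."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


uuid = extract_vtm_uuid(
    'https://vtm.be/gast-vernielt-genkse-hotelkamer~ve7534523-279f-4b4d-a5c9-a33ffdbe23e1')
print(uuid)
```

The `~v` marker sits outside the named group, so the value sent to the GraphQL endpoint is the bare UUID without that prefix.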

View File

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import (
ExtractorError,
int_or_none,
@@ -47,6 +48,22 @@ class VVVVIDIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
# video_type == 'video/youtube'
'url': 'https://www.vvvvid.it/show/404/one-punch-man/406/486683/trailer',
'md5': '33e0edfba720ad73a8782157fdebc648',
'info_dict': {
'id': 'RzmFKUDOUgw',
'ext': 'mp4',
'title': 'Trailer',
'upload_date': '20150906',
'description': 'md5:a5e802558d35247fee285875328c0b80',
'uploader_id': 'BandaiVisual',
'uploader': 'BANDAI NAMCO Arts Channel',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.vvvvid.it/show/434/perche-dovrei-guardarlo-di-dario-moccia/437/489048',
'only_matching': True
@@ -154,12 +171,13 @@ class VVVVIDIE(InfoExtractor):
if season_number:
info['season_number'] = int(season_number)
for quality in ('_sd', ''):
video_type = video_data.get('video_type')
is_youtube = False
for quality in ('', '_sd'):
embed_code = video_data.get('embed_info' + quality)
if not embed_code:
continue
embed_code = ds(embed_code)
video_type = video_data.get('video_type')
if video_type in ('video/rcs', 'video/kenc'):
if video_type == 'video/kenc':
kenc = self._download_json(
@@ -172,19 +190,28 @@ class VVVVIDIE(InfoExtractor):
if kenc_message:
embed_code += '?' + ds(kenc_message)
formats.extend(self._extract_akamai_formats(embed_code, video_id))
elif video_type == 'video/youtube':
info.update({
'_type': 'url_transparent',
'ie_key': YoutubeIE.ie_key(),
'url': embed_code,
})
is_youtube = True
break
else:
formats.extend(self._extract_wowza_formats(
'http://sb.top-ix.org/videomg/_definst_/mp4:%s/playlist.m3u8' % embed_code, video_id))
metadata_from_url(embed_code)
self._sort_formats(formats)
if not is_youtube:
self._sort_formats(formats)
info['formats'] = formats
metadata_from_url(video_data.get('thumbnail'))
info.update(self._extract_common_video_info(video_data))
info.update({
'id': video_id,
'title': title,
'formats': formats,
'duration': int_or_none(video_data.get('length')),
'series': video_data.get('show_title'),
'season_id': season_id,

View File

@@ -1,12 +1,9 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
unified_strdate,
HEADRequest,
int_or_none,
@@ -46,15 +43,6 @@ class WatIE(InfoExtractor):
},
]
_FORMATS = (
(200, 416, 234),
(400, 480, 270),
(600, 640, 360),
(1200, 640, 360),
(1800, 960, 540),
(2500, 1280, 720),
)
def _real_extract(self, url):
video_id = self._match_id(url)
video_id = video_id if video_id.isdigit() and len(video_id) > 6 else compat_str(int(video_id, 36))
@@ -97,46 +85,20 @@ class WatIE(InfoExtractor):
return red_url
return None
def remove_bitrate_limit(manifest_url):
return re.sub(r'(?:max|min)_bitrate=\d+&?', '', manifest_url)
formats = []
try:
alt_urls = lambda manifest_url: [re.sub(r'(?:wdv|ssm)?\.ism/', repl + '.ism/', manifest_url) for repl in ('', 'ssm')]
manifest_urls = self._download_json(
'http://www.wat.tv/get/webhtml/' + video_id, video_id)
m3u8_url = manifest_urls.get('hls')
if m3u8_url:
m3u8_url = remove_bitrate_limit(m3u8_url)
for m3u8_alt_url in alt_urls(m3u8_url):
formats.extend(self._extract_m3u8_formats(
m3u8_alt_url, video_id, 'mp4',
'm3u8_native', m3u8_id='hls', fatal=False))
formats.extend(self._extract_f4m_formats(
m3u8_alt_url.replace('ios', 'web').replace('.m3u8', '.f4m'),
video_id, f4m_id='hds', fatal=False))
mpd_url = manifest_urls.get('mpd')
if mpd_url:
mpd_url = remove_bitrate_limit(mpd_url)
for mpd_alt_url in alt_urls(mpd_url):
formats.extend(self._extract_mpd_formats(
mpd_alt_url, video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats)
except ExtractorError:
abr = 64
for vbr, width, height in self._FORMATS:
tbr = vbr + abr
format_id = 'http-%s' % tbr
fmt_url = 'http://dnl.adv.tf1.fr/2/USP-0x0/%s/%s/%s/ssm/%s-%s-64k.mp4' % (video_id[-4:-2], video_id[-2:], video_id, video_id, vbr)
if self._is_valid_url(fmt_url, video_id, format_id):
formats.append({
'format_id': format_id,
'url': fmt_url,
'vbr': vbr,
'abr': abr,
'width': width,
'height': height,
})
manifest_urls = self._download_json(
'http://www.wat.tv/get/webhtml/' + video_id, video_id)
m3u8_url = manifest_urls.get('hls')
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4',
'm3u8_native', m3u8_id='hls', fatal=False))
mpd_url = manifest_urls.get('mpd')
if mpd_url:
formats.extend(self._extract_mpd_formats(
mpd_url.replace('://das-q1.tf1.fr/', '://das-q1-ssl.tf1.fr/'),
video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats)
date_diffusion = first_chapter.get('date_diffusion') or video_data.get('configv4', {}).get('estatS4')
upload_date = unified_strdate(date_diffusion) if date_diffusion else None

View File

@@ -177,46 +177,9 @@ class YahooIE(InfoExtractor):
'only_matching': True,
}]
def _real_extract(self, url):
url, country, display_id = re.match(self._VALID_URL, url).groups()
if not country:
country = 'us'
else:
country = country.split('-')[0]
api_base = 'https://%s.yahoo.com/_td/api/resource/' % country
for i, uuid in enumerate(['url=' + url, 'ymedia-alias=' + display_id]):
content = self._download_json(
api_base + 'content;getDetailView=true;uuids=["%s"]' % uuid,
display_id, 'Downloading content JSON metadata', fatal=i == 1)
if content:
item = content['items'][0]
break
if item.get('type') != 'video':
entries = []
cover = item.get('cover') or {}
if cover.get('type') == 'yvideo':
cover_url = cover.get('url')
if cover_url:
entries.append(self.url_result(
cover_url, 'Yahoo', cover.get('uuid')))
for e in item.get('body', []):
if e.get('type') == 'videoIframe':
iframe_url = e.get('url')
if not iframe_url:
continue
entries.append(self.url_result(iframe_url))
return self.playlist_result(
entries, item.get('uuid'),
item.get('title'), item.get('summary'))
video_id = item['uuid']
def _extract_yahoo_video(self, video_id, country):
video = self._download_json(
api_base + 'VideoService.videos;view=full;video_ids=["%s"]' % video_id,
'https://%s.yahoo.com/_td/api/resource/VideoService.videos;view=full;video_ids=["%s"]' % (country, video_id),
video_id, 'Downloading video JSON metadata')[0]
title = video['title']
@@ -298,7 +261,6 @@ class YahooIE(InfoExtractor):
'id': video_id,
'title': self._live_title(title) if is_live else title,
'formats': formats,
'display_id': display_id,
'thumbnails': thumbnails,
'description': clean_html(video.get('description')),
'timestamp': parse_iso8601(video.get('publish_time')),
@@ -311,6 +273,44 @@ class YahooIE(InfoExtractor):
'episode_number': int_or_none(series_info.get('episode_number')),
}
def _real_extract(self, url):
url, country, display_id = re.match(self._VALID_URL, url).groups()
if not country:
country = 'us'
else:
country = country.split('-')[0]
item = self._download_json(
'https://%s.yahoo.com/caas/content/article' % country, display_id,
'Downloading content JSON metadata', query={
'url': url
})['items'][0]['data']['partnerData']
if item.get('type') != 'video':
entries = []
cover = item.get('cover') or {}
if cover.get('type') == 'yvideo':
cover_url = cover.get('url')
if cover_url:
entries.append(self.url_result(
cover_url, 'Yahoo', cover.get('uuid')))
for e in (item.get('body') or []):
if e.get('type') == 'videoIframe':
iframe_url = e.get('url')
if not iframe_url:
continue
entries.append(self.url_result(iframe_url))
return self.playlist_result(
entries, item.get('uuid'),
item.get('title'), item.get('summary'))
info = self._extract_yahoo_video(item['uuid'], country)
info['display_id'] = display_id
return info
class YahooSearchIE(SearchInfoExtractor):
IE_DESC = 'Yahoo screen search'

File diff suppressed because it is too large

View File

@@ -87,11 +87,16 @@ class ZypeIE(InfoExtractor):
r'(["\'])(?P<url>(?:(?!\1).)+\.m3u8(?:(?!\1).)*)\1',
body, 'm3u8 url', group='url', default=None)
if not m3u8_url:
source = self._parse_json(self._search_regex(
r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', body,
'source'), video_id, js_to_json)
if source.get('integration') == 'verizon-media':
m3u8_url = 'https://content.uplynk.com/%s.m3u8' % source['id']
source = self._search_regex(
r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', body, 'source')
def get_attr(key):
return self._search_regex(
r'\b%s\s*:\s*([\'"])(?P<val>(?:(?!\1).)+)\1' % key,
source, key, group='val')
if get_attr('integration') == 'verizon-media':
m3u8_url = 'https://content.uplynk.com/%s.m3u8' % get_attr('id')
formats = self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls')
text_tracks = self._search_regex(
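
The replacement code in this hunk reads individual attributes out of the matched `sources` object with a regular expression rather than parsing the object as JSON. A minimal stand-alone sketch of that lookup is shown below. The function signature, the use of `re.escape`, the `None` return on a miss, and the sample `source` string are all assumptions made for illustration; the nested helper in the diff calls `self._search_regex`, which reports a missing match through the extractor instead.

```python
import re


def get_attr(source, key):
    """Hypothetical stand-alone variant of the nested get_attr helper.

    Finds `key: 'value'` or `key: "value"` and returns the quoted value.
    """
    mobj = re.search(
        r'\b%s\s*:\s*([\'"])(?P<val>(?:(?!\1).)+)\1' % re.escape(key),
        source)
    return mobj.group('val') if mobj else None


# Made-up player snippet with unquoted keys, for illustration only.
source = "{ integration: 'verizon-media', id: \"0123abcd\", autoplay: false }"

if get_attr(source, 'integration') == 'verizon-media':
    m3u8_url = 'https://content.uplynk.com/%s.m3u8' % get_attr(source, 'id')
    print(m3u8_url)
```

The backreference `\1` makes the closing quote match whichever quote character opened the value, so single- and double-quoted attributes are both handled. Unquoted values such as `false` are not matched.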

View File

@@ -14,12 +14,18 @@ from .compat import (
compat_shlex_split,
)
from .utils import (
expand_path,
get_executable_path,
OUTTMPL_TYPES,
preferredencoding,
write_string,
)
from .version import __version__
_remux_formats = ('mp4', 'mkv', 'flv', 'webm', 'mov', 'avi', 'mp3', 'mka', 'm4a', 'ogg', 'opus')
def _hide_login_info(opts):
PRIVATE_OPTS = set(['-p', '--password', '-u', '--username', '--video-password', '--ap-password', '--ap-username'])
eqre = re.compile('^(?P<key>' + ('|'.join(re.escape(po) for po in PRIVATE_OPTS)) + ')=.+$')
@@ -62,7 +68,7 @@ def parseOpts(overrideArguments=None):
userConfFile = os.path.join(xdg_config_home, '%s.conf' % package_name)
userConf = _readOptions(userConfFile, default=None)
if userConf is not None:
return userConf
return userConf, userConfFile
# appdata
appdata_dir = compat_getenv('appdata')
@@ -70,19 +76,21 @@ def parseOpts(overrideArguments=None):
userConfFile = os.path.join(appdata_dir, package_name, 'config')
userConf = _readOptions(userConfFile, default=None)
if userConf is None:
userConf = _readOptions('%s.txt' % userConfFile, default=None)
userConfFile += '.txt'
userConf = _readOptions(userConfFile, default=None)
if userConf is not None:
return userConf
return userConf, userConfFile
# home
userConfFile = os.path.join(compat_expanduser('~'), '%s.conf' % package_name)
userConf = _readOptions(userConfFile, default=None)
if userConf is None:
userConf = _readOptions('%s.txt' % userConfFile, default=None)
userConfFile += '.txt'
userConf = _readOptions(userConfFile, default=None)
if userConf is not None:
return userConf
return userConf, userConfFile
return default
return default, None
def _format_option_string(option):
''' ('-o', '--option') -> -o, --format METAVAR'''
@@ -104,6 +112,20 @@ def parseOpts(overrideArguments=None):
    def _comma_separated_values_options_callback(option, opt_str, value, parser):
        setattr(parser.values, option.dest, value.split(','))

    def _dict_from_multiple_values_options_callback(
            option, opt_str, value, parser, allowed_keys=r'[\w-]+', delimiter=':', default_key=None, process=None):

        out_dict = getattr(parser.values, option.dest)
        mobj = re.match(r'(?i)(?P<key>%s)%s(?P<val>.*)$' % (allowed_keys, delimiter), value)
        if mobj is not None:
            key, val = mobj.group('key').lower(), mobj.group('val')
        elif default_key is not None:
            key, val = default_key, value
        else:
            raise optparse.OptionValueError(
                'wrong %s formatting; it should be %s, not "%s"' % (opt_str, option.metavar, value))
        out_dict[key] = process(val) if callable(process) else val

    # No need to wrap help messages if we're on a wide console
    columns = compat_get_terminal_size().columns
    max_width = columns if columns else 80
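
To illustrate how the callback in this hunk turns repeated `KEY:VALUE` options into a dictionary, here is a minimal sketch wired into a throwaway `optparse` parser. The names `dict_option_callback` and `parse_output_templates` are hypothetical, and `allowed_keys` is limited to an illustrative subset rather than the real `OUTTMPL_TYPES` keys.

```python
import optparse
import re


def dict_option_callback(option, opt_str, value, parser,
                         allowed_keys=r'[\w-]+', delimiter=':',
                         default_key=None, process=None):
    # Same logic as _dict_from_multiple_values_options_callback in the diff.
    out_dict = getattr(parser.values, option.dest)
    mobj = re.match(
        r'(?i)(?P<key>%s)%s(?P<val>.*)$' % (allowed_keys, delimiter), value)
    if mobj is not None:
        key, val = mobj.group('key').lower(), mobj.group('val')
    elif default_key is not None:
        # No recognised "KEY:" prefix: store the whole value under default_key.
        key, val = default_key, value
    else:
        raise optparse.OptionValueError(
            'wrong %s formatting; it should be %s, not "%s"'
            % (opt_str, option.metavar, value))
    out_dict[key] = process(val) if callable(process) else val


def parse_output_templates(argv):
    """Hypothetical wrapper: build a fresh parser and return the -o dict."""
    parser = optparse.OptionParser()
    parser.add_option(
        '-o', '--output',
        metavar='[TYPE:]TEMPLATE', dest='outtmpl', default={}, type='string',
        action='callback', callback=dict_option_callback,
        callback_kwargs={
            'allowed_keys': 'subtitle|thumbnail',  # illustrative subset only
            'default_key': 'default',
            'process': lambda x: x.strip()})
    opts, _ = parser.parse_args(argv)
    return opts.outtmpl


print(parse_output_templates(
    ['-o', '%(title)s.%(ext)s', '-o', 'thumbnail:thumbs/%(id)s']))
```

A fresh parser is built on every call because the `default={}` object is otherwise shared between parses, so entries from one `parse_args` call would still be present in the next.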
@@ -173,7 +195,7 @@ def parseOpts(overrideArguments=None):
general.add_option(
'--config-location',
dest='config_location', metavar='PATH',
help='Location of the configuration file; either the path to the config or its containing directory')
help='Location of the main configuration file; either the path to the config or its containing directory')
general.add_option(
'--flat-playlist',
action='store_const', dest='extract_flat', const='in_playlist', default=False,
@@ -618,14 +640,21 @@ def parseOpts(overrideArguments=None):
'video while downloading (some players may not be able to play it)'))
downloader.add_option(
'--external-downloader',
dest='external_downloader', metavar='COMMAND',
dest='external_downloader', metavar='NAME',
help=(
'Use the specified external downloader. '
'Currently supports %s' % ','.join(list_external_downloaders())))
'Currently supports %s' % ', '.join(list_external_downloaders())))
downloader.add_option(
'--external-downloader-args',
dest='external_downloader_args', metavar='ARGS',
help='Give these arguments to the external downloader')
'--downloader-args', '--external-downloader-args',
metavar='NAME:ARGS', dest='external_downloader_args', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': '|'.join(list_external_downloaders()),
'default_key': 'default', 'process': compat_shlex_split},
help=(
'Give these arguments to the external downloader. '
'Specify the downloader name and the arguments separated by a colon ":". '
'You can use this option multiple times (Alias: --external-downloader-args)'))
workarounds = optparse.OptionGroup(parser, 'Workarounds')
workarounds.add_option(
@@ -651,8 +680,9 @@ def parseOpts(overrideArguments=None):
)
workarounds.add_option(
'--add-header',
metavar='FIELD:VALUE', dest='headers', action='append',
help='Specify a custom HTTP header and its value, separated by a colon \':\'. You can use this option multiple times',
metavar='FIELD:VALUE', dest='headers', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
help='Specify a custom HTTP header and its value, separated by a colon ":". You can use this option multiple times',
)
workarounds.add_option(
'--bidi-workaround',
@@ -797,10 +827,33 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'--id', default=False,
action='store_true', dest='useid', help=optparse.SUPPRESS_HELP)
filesystem.add_option(
'-P', '--paths',
metavar='TYPE:PATH', dest='paths', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': 'home|temp|%s' % '|'.join(OUTTMPL_TYPES.keys()),
'process': lambda x: x.strip()},
help=(
'The paths where the files should be downloaded. '
'Specify the type of file and the path separated by a colon ":". '
'All the same types as --output are supported. '
'Additionally, you can also provide "home" and "temp" paths. '
'All intermediary files are first downloaded to the temp path and '
'then the final files are moved over to the home path after download is finished. '
'This option is ignored if --output is an absolute path'))
filesystem.add_option(
'-o', '--output',
dest='outtmpl', metavar='TEMPLATE',
metavar='[TYPE:]TEMPLATE', dest='outtmpl', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={
'allowed_keys': '|'.join(OUTTMPL_TYPES.keys()),
'default_key': 'default', 'process': lambda x: x.strip()},
help='Output filename template, see "OUTPUT TEMPLATE" for details')
filesystem.add_option(
'--output-na-placeholder',
dest='outtmpl_na_placeholder', metavar='TEXT', default='NA',
help=('Placeholder value for unavailable meta fields in output filename template (default: "%default")'))
filesystem.add_option(
'--autonumber-size',
dest='autonumber_size', metavar='NUMBER', type=int,
@@ -844,11 +897,13 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'-c', '--continue',
action='store_true', dest='continue_dl', default=True,
help='Resume partially downloaded files (default)')
help='Resume partially downloaded files/fragments (default)')
filesystem.add_option(
'--no-continue',
action='store_false', dest='continue_dl',
help='Restart download of partially downloaded files from beginning')
help=(
'Do not resume partially downloaded fragments. '
'If the file is unfragmented, restart download of the entire file'))
filesystem.add_option(
'--part',
action='store_false', dest='nopart', default=False,
@@ -876,7 +931,7 @@ def parseOpts(overrideArguments=None):
filesystem.add_option(
'--write-info-json',
action='store_true', dest='writeinfojson', default=False,
help='Write video metadata to a .info.json file')
help='Write video metadata to a .info.json file (this may contain personal information)')
filesystem.add_option(
'--no-write-info-json',
action='store_false', dest='writeinfojson',
@@ -889,6 +944,22 @@ def parseOpts(overrideArguments=None):
'--no-write-annotations',
action='store_false', dest='writeannotations',
help='Do not write video annotations (default)')
filesystem.add_option(
'--write-playlist-metafiles',
action='store_true', dest='allow_playlist_files', default=True,
help=(
'Write playlist metadata in addition to the video metadata '
'when using --write-info-json, --write-description etc. (default)'))
filesystem.add_option(
'--no-write-playlist-metafiles',
action='store_false', dest='allow_playlist_files',
help=(
'Do not write playlist metadata when using '
'--write-info-json, --write-description etc.'))
filesystem.add_option(
'--get-comments',
action='store_true', dest='getcomments', default=False,
help='Retrieve video comments to be placed in the .info.json file')
filesystem.add_option(
'--load-info-json', '--load-info',
dest='load_info_filename', metavar='FILE',
@@ -956,37 +1027,44 @@ def parseOpts(overrideArguments=None):
postproc.add_option(
'-x', '--extract-audio',
action='store_true', dest='extractaudio', default=False,
help='Convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
help='Convert video files to audio-only files (requires ffmpeg and ffprobe)')
postproc.add_option(
'--audio-format', metavar='FORMAT', dest='audioformat', default='best',
help='Specify audio format: "best", "aac", "flac", "mp3", "m4a", "opus", "vorbis", or "wav"; "%default" by default; No effect without -x')
postproc.add_option(
'--audio-quality', metavar='QUALITY',
dest='audioquality', default='5',
help='Specify ffmpeg/avconv audio quality, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default %default)')
help='Specify ffmpeg audio quality, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default %default)')
postproc.add_option(
'--remux-video',
metavar='FORMAT', dest='remuxvideo', default=None,
help=(
'Remux the video into another container if necessary (currently supported: mp4|mkv). '
'If target container does not support the video/audio codec, remuxing will fail'))
'Remux the video into another container if necessary (currently supported: %s). '
'If target container does not support the video/audio codec, remuxing will fail. '
'You can specify multiple rules; eg. "aac>m4a/mov>mp4/mkv" will remux aac to m4a, mov to mp4 '
'and anything else to mkv.' % '|'.join(_remux_formats)))
postproc.add_option(
'--recode-video',
metavar='FORMAT', dest='recodevideo', default=None,
help='Re-encode the video into another format if re-encoding is necessary (currently supported: mp4|flv|ogg|webm|mkv|avi)')
help=(
'Re-encode the video into another format if re-encoding is necessary. '
'The supported formats are the same as --remux-video'))
postproc.add_option(
'--postprocessor-args', '--ppa', metavar='NAME:ARGS',
dest='postprocessor_args', action='append',
'--postprocessor-args', '--ppa',
metavar='NAME:ARGS', dest='postprocessor_args', default={}, type='str',
action='callback', callback=_dict_from_multiple_values_options_callback,
callback_kwargs={'default_key': 'default-compat', 'allowed_keys': r'\w+(?:\+\w+)?', 'process': compat_shlex_split},
help=(
'Give these arguments to the postprocessors. '
'Specify the postprocessor/executable name and the arguments separated by a colon ":" '
'to give the argument to only the specified postprocessor/executable. Supported postprocessors are: '
'to give the argument to the specified postprocessor/executable. Supported postprocessors are: '
'SponSkrub, ExtractAudio, VideoRemuxer, VideoConvertor, EmbedSubtitle, Metadata, Merger, '
'FixupStretched, FixupM4a, FixupM3u8, SubtitlesConvertor and EmbedThumbnail. '
'The supported executables are: SponSkrub, FFmpeg, FFprobe, avconf, avprobe and AtomicParsley. '
'The supported executables are: SponSkrub, FFmpeg, FFprobe, and AtomicParsley. '
'You can use this option multiple times to give different arguments to different postprocessors. '
'You can also specify "PP+EXE:ARGS" to give the arguments to the specified executable '
'only when being used by the specified postprocessor (Alias: --ppa)'))
'only when being used by the specified postprocessor. '
'You can use this option multiple times (Alias: --ppa)'))
postproc.add_option(
'-k', '--keep-video',
action='store_true', dest='keepvideo', default=False,
@@ -1030,14 +1108,20 @@ def parseOpts(overrideArguments=None):
postproc.add_option(
'--metadata-from-title',
metavar='FORMAT', dest='metafromtitle',
help=optparse.SUPPRESS_HELP)
postproc.add_option(
'--parse-metadata',
metavar='FIELD:FORMAT', dest='metafromfield', action='append',
help=(
'Parse additional metadata like song title / artist from the video title. '
'The format syntax is the same as --output. Regular expression with '
'named capture groups may also be used. '
'The parsed parameters replace existing values. '
'Example: --metadata-from-title "%(artist)s - %(title)s" matches a title like '
'Parse additional metadata like title/artist from other fields. '
'Give field name to extract data from, and format of the field separated by a ":". '
'Either regular expression with named capture groups or a '
'similar syntax to the output template can also be used. '
'The parsed parameters replace any existing values and can be used in output template. '
'This option can be used multiple times. '
'Example: --parse-metadata "title:%(artist)s - %(title)s" matches a title like '
'"Coldplay - Paradise". '
'Example (regex): --metadata-from-title "(?P<artist>.+?) - (?P<title>.+)"'))
'Example (regex): --parse-metadata "description:Artist - (?P<artist>.+?)"'))
postproc.add_option(
'--xattrs',
action='store_true', dest='xattrs', default=False,
@@ -1052,15 +1136,15 @@ def parseOpts(overrideArguments=None):
postproc.add_option(
'--prefer-avconv', '--no-prefer-ffmpeg',
action='store_false', dest='prefer_ffmpeg',
help='Prefer avconv over ffmpeg for running the postprocessors (Alias: --no-prefer-ffmpeg)')
help=optparse.SUPPRESS_HELP)
postproc.add_option(
'--prefer-ffmpeg', '--no-prefer-avconv',
action='store_true', dest='prefer_ffmpeg',
help='Prefer ffmpeg over avconv for running the postprocessors (default) (Alias: --no-prefer-avconv)')
action='store_true', dest='prefer_ffmpeg', default=True,
help=optparse.SUPPRESS_HELP)
postproc.add_option(
'--ffmpeg-location', '--avconv-location', metavar='PATH',
dest='ffmpeg_location',
help='Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory (Alias: --avconv-location)')
help='Location of the ffmpeg binary; either the path to the binary or its containing directory')
postproc.add_option(
'--exec',
metavar='CMD', dest='exec_cmd',
@@ -1146,59 +1230,69 @@ def parseOpts(overrideArguments=None):
return conf
configs = {
'command_line': compat_conf(sys.argv[1:]),
'custom': [], 'portable': [], 'user': [], 'system': []}
opts, args = parser.parse_args(configs['command_line'])
'command-line': compat_conf(sys.argv[1:]),
'custom': [], 'home': [], 'portable': [], 'user': [], 'system': []}
paths = {'command-line': False}
opts, args = parser.parse_args(configs['command-line'])
def get_configs():
if '--config-location' in configs['command_line']:
if '--config-location' in configs['command-line']:
location = compat_expanduser(opts.config_location)
if os.path.isdir(location):
location = os.path.join(location, 'youtube-dlc.conf')
if not os.path.exists(location):
parser.error('config-location %s does not exist.' % location)
configs['custom'] = _readOptions(location)
if '--ignore-config' in configs['command_line']:
configs['custom'] = _readOptions(location, default=None)
if configs['custom'] is None:
configs['custom'] = []
else:
paths['custom'] = location
if '--ignore-config' in configs['command-line']:
return
if '--ignore-config' in configs['custom']:
return
def get_portable_path():
path = os.path.dirname(sys.argv[0])
if os.path.abspath(sys.argv[0]) != os.path.abspath(sys.executable): # Not packaged
path = os.path.join(path, '..')
return os.path.abspath(path)
run_path = get_portable_path()
configs['portable'] = _readOptions(os.path.join(run_path, 'yt-dlp.conf'), default=None)
if configs['portable'] is None:
configs['portable'] = _readOptions(os.path.join(run_path, 'youtube-dlc.conf'))
def read_options(path, user=False):
for package in ('yt-dlp', 'youtube-dlc'):
if user:
config, current_path = _readUserConf(package, default=None)
else:
current_path = os.path.join(path, '%s.conf' % package)
config = _readOptions(current_path, default=None)
if config is not None:
return config, current_path
return [], None
configs['portable'], paths['portable'] = read_options(get_executable_path())
if '--ignore-config' in configs['portable']:
return
configs['system'] = _readOptions('/etc/yt-dlp.conf', default=None)
if configs['system'] is None:
configs['system'] = _readOptions('/etc/youtube-dlc.conf')
def get_home_path():
opts = parser.parse_args(configs['portable'] + configs['custom'] + configs['command-line'])[0]
return expand_path(opts.paths.get('home', '')).strip()
configs['home'], paths['home'] = read_options(get_home_path())
if '--ignore-config' in configs['home']:
return
configs['system'], paths['system'] = read_options('/etc')
if '--ignore-config' in configs['system']:
return
configs['user'] = _readUserConf('yt-dlp', default=None)
if configs['user'] is None:
configs['user'] = _readUserConf('youtube-dlc')
configs['user'], paths['user'] = read_options('', True)
if '--ignore-config' in configs['user']:
configs['system'] = []
configs['system'], paths['system'] = [], None
get_configs()
argv = configs['system'] + configs['user'] + configs['portable'] + configs['custom'] + configs['command_line']
argv = configs['system'] + configs['user'] + configs['home'] + configs['portable'] + configs['custom'] + configs['command-line']
opts, args = parser.parse_args(argv)
if opts.verbose:
for conf_label, conf in (
('System config', configs['system']),
('User config', configs['user']),
('Portable config', configs['portable']),
('Custom config', configs['custom']),
('Command-line args', configs['command_line'])):
write_string('[debug] %s: %s\n' % (conf_label, repr(_hide_login_info(conf))))
for label in ('System', 'User', 'Portable', 'Home', 'Custom', 'Command-line'):
key = label.lower()
if paths.get(key) is None:
continue
if paths[key]:
write_string('[debug] %s config file: %s\n' % (label, paths[key]))
write_string('[debug] %s config: %s\n' % (label, repr(_hide_login_info(configs[key]))))
return parser, opts, args
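
The final `argv` in this hunk is built by concatenating the config sources in a fixed order, so an option given in a later source replaces the same option from an earlier one. The sketch below reproduces only that ordering with a two-option `optparse` parser. The names `CONFIG_ORDER`, `build_argv`, and `parse` are hypothetical, and the config contents are made up.

```python
import optparse

# Order used when building argv in the diff; later entries take precedence.
CONFIG_ORDER = ('system', 'user', 'home', 'portable', 'custom', 'command-line')


def build_argv(configs):
    """Concatenate the per-source argument lists in precedence order."""
    argv = []
    for name in CONFIG_ORDER:
        argv += configs.get(name, [])
    return argv


def parse(configs):
    """Hypothetical helper: parse the merged argv with a tiny parser."""
    parser = optparse.OptionParser()
    parser.add_option('--audio-quality', dest='audioquality', default='5')
    parser.add_option('--audio-format', dest='audioformat', default='best')
    opts, _ = parser.parse_args(build_argv(configs))
    return opts


opts = parse({
    'system': ['--audio-quality', '9'],
    'user': ['--audio-quality', '3'],
    'command-line': ['--audio-format', 'mp3'],
})
print(opts.audioquality, opts.audioformat)
```

With plain `store` options the last occurrence wins, which is why the command-line arguments are appended last.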

View File

@@ -16,7 +16,9 @@ from .ffmpeg import (
)
from .xattrpp import XAttrMetadataPP
from .execafterdownload import ExecAfterDownloadPP
from .metadatafromtitle import MetadataFromTitlePP
from .metadatafromfield import MetadataFromFieldPP
from .metadatafromfield import MetadataFromTitlePP
from .movefilesafterdownload import MoveFilesAfterDownloadPP
from .sponskrub import SponSkrubPP
@@ -38,7 +40,9 @@ __all__ = [
'FFmpegSubtitlesConvertorPP',
'FFmpegVideoConvertorPP',
'FFmpegVideoRemuxerPP',
'MetadataFromFieldPP',
'MetadataFromTitlePP',
'MoveFilesAfterDownloadPP',
'SponSkrubPP',
'XAttrMetadataPP',
]

View File

@@ -4,8 +4,9 @@ import os
from ..compat import compat_str
from ..utils import (
PostProcessingError,
cli_configuration_args,
encodeFilename,
PostProcessingError,
)
@@ -55,7 +56,7 @@ class PostProcessor(object):
def write_debug(self, text, prefix=True, *args, **kwargs):
tag = '[debug] ' if prefix else ''
if self.get_param('verbose', False):
if self.get_param('verbose', False) and self._downloader:
return self._downloader.to_screen('%s%s' % (tag, text), *args, **kwargs)
def get_param(self, name, default=None, *args, **kwargs):
@@ -91,39 +92,10 @@ class PostProcessor(object):
self.report_warning(errnote)
def _configuration_args(self, default=[], exe=None):
args = self.get_param('postprocessor_args', {})
pp_key = self.pp_key().lower()
if isinstance(args, (list, tuple)): # for backward compatibility
return default if pp_key == 'sponskrub' else args
if args is None:
return default
assert isinstance(args, dict)
exe_args = None
if exe is not None:
assert isinstance(exe, compat_str)
exe = exe.lower()
specific_args = args.get('%s+%s' % (pp_key, exe))
if specific_args is not None:
assert isinstance(specific_args, (list, tuple))
return specific_args
exe_args = args.get(exe)
pp_args = args.get(pp_key) if pp_key != exe else None
if pp_args is None and exe_args is None:
default = args.get('default', default)
assert isinstance(default, (list, tuple))
return default
if pp_args is None:
pp_args = []
elif exe_args is None:
exe_args = []
assert isinstance(pp_args, (list, tuple))
assert isinstance(exe_args, (list, tuple))
return pp_args + exe_args
key = self.pp_key().lower()
args, is_compat = cli_configuration_args(
self._downloader.params, 'postprocessor_args', key, default, exe)
return args if not is_compat or key != 'sponskrub' else default
class AudioConversionError(PostProcessingError):
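
The removed body of `_configuration_args` spells out how postprocessor arguments were looked up: a `PP+EXE` entry wins outright, otherwise the postprocessor and executable entries are combined, and `default` is used only when neither exists. The sketch below restates that removed logic as a plain function. The name `resolve_pp_args` is hypothetical, and whether `cli_configuration_args` behaves identically is not shown in this diff.

```python
def resolve_pp_args(args, pp_key, exe=None, default=()):
    """Hypothetical restatement of the removed _configuration_args lookup.

    `args` maps lower-case keys ('merger', 'ffmpeg', 'merger+ffmpeg',
    'default') to lists of command-line arguments.
    """
    pp_key = pp_key.lower()
    exe_args = None
    if exe is not None:
        exe = exe.lower()
        # 1. An exact "postprocessor+executable" entry takes priority.
        specific_args = args.get('%s+%s' % (pp_key, exe))
        if specific_args is not None:
            return list(specific_args)
        exe_args = args.get(exe)
    pp_args = args.get(pp_key) if pp_key != exe else None
    # 2. Fall back to 'default' only when neither entry exists.
    if pp_args is None and exe_args is None:
        return list(args.get('default', default))
    # 3. Otherwise combine postprocessor args followed by executable args.
    return list(pp_args or []) + list(exe_args or [])


args = {
    'default': ['-loglevel', 'warning'],
    'ffmpeg': ['-threads', '2'],
    'merger+ffmpeg': ['-shortest'],
    'embedthumbnail': ['-vsync', '0'],
}
print(resolve_pp_args(args, 'Merger', 'FFmpeg'))
print(resolve_pp_args(args, 'EmbedThumbnail', 'FFmpeg'))
print(resolve_pp_args(args, 'Metadata', 'AtomicParsley'))
```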

View File

@@ -4,6 +4,15 @@ from __future__ import unicode_literals
import os
import subprocess
import struct
import re
import base64
try:
import mutagen
_has_mutagen = True
except ImportError:
_has_mutagen = False
from .ffmpeg import FFmpegPostProcessor
@@ -11,11 +20,12 @@ from ..utils import (
check_executable,
encodeArgument,
encodeFilename,
error_to_compat_str,
PostProcessingError,
prepend_extension,
process_communicate_or_kill,
replace_extension,
shell_quote,
process_communicate_or_kill,
)
@@ -32,6 +42,7 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
def run(self, info):
filename = info['filepath']
temp_filename = prepend_extension(filename, 'temp')
files_to_delete = []
if not info.get('thumbnails'):
self.to_screen('There aren\'t any thumbnails to embed')
@@ -68,11 +79,12 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
escaped_thumbnail_jpg_filename = replace_extension(escaped_thumbnail_filename, 'jpg')
self.to_screen('Converting thumbnail "%s" to JPEG' % escaped_thumbnail_filename)
self.run_ffmpeg(escaped_thumbnail_filename, escaped_thumbnail_jpg_filename, ['-bsf:v', 'mjpeg2jpeg'])
os.remove(encodeFilename(escaped_thumbnail_filename))
files_to_delete.append(escaped_thumbnail_filename)
thumbnail_jpg_filename = replace_extension(thumbnail_filename, 'jpg')
# Rename back to unescaped for further processing
os.rename(encodeFilename(escaped_thumbnail_jpg_filename), encodeFilename(thumbnail_jpg_filename))
thumbnail_filename = thumbnail_jpg_filename
thumbnail_ext = 'jpg'
success = True
if info['ext'] == 'mp3':
@@ -83,47 +95,98 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
self.to_screen('Adding thumbnail to "%s"' % filename)
self.run_ffmpeg_multiple_files([filename, thumbnail_filename], temp_filename, options)
elif info['ext'] == 'mkv':
options = [
'-c', 'copy', '-map', '0', '-dn', '-attach', thumbnail_filename,
'-metadata:s:t', 'mimetype=image/jpeg', '-metadata:s:t', 'filename=cover.jpg']
elif info['ext'] in ['mkv', 'mka']:
options = ['-c', 'copy', '-map', '0', '-dn']
mimetype = 'image/%s' % ('png' if thumbnail_ext == 'png' else 'jpeg')
old_stream, new_stream = self.get_stream_number(
filename, ('tags', 'mimetype'), mimetype)
if old_stream is not None:
options.extend(['-map', '-0:%d' % old_stream])
new_stream -= 1
options.extend([
'-attach', thumbnail_filename,
'-metadata:s:%d' % new_stream, 'mimetype=%s' % mimetype,
'-metadata:s:%d' % new_stream, 'filename=cover.%s' % thumbnail_ext])
self.to_screen('Adding thumbnail to "%s"' % filename)
self.run_ffmpeg_multiple_files([filename], temp_filename, options)
self.run_ffmpeg(filename, temp_filename, options)
elif info['ext'] in ['m4a', 'mp4']:
if not check_executable('AtomicParsley', ['-v']):
raise EmbedThumbnailPPError('AtomicParsley was not found. Please install.')
elif info['ext'] in ['m4a', 'mp4', 'mov']:
try:
options = ['-c', 'copy', '-map', '0', '-dn', '-map', '1']
cmd = [encodeFilename('AtomicParsley', True),
encodeFilename(filename, True),
encodeArgument('--artwork'),
encodeFilename(thumbnail_filename, True),
encodeArgument('-o'),
encodeFilename(temp_filename, True)]
cmd += [encodeArgument(o) for o in self._configuration_args(exe='AtomicParsley')]
old_stream, new_stream = self.get_stream_number(
filename, ('disposition', 'attached_pic'), 1)
if old_stream is not None:
options.extend(['-map', '-0:%d' % old_stream])
new_stream -= 1
options.extend(['-disposition:%s' % new_stream, 'attached_pic'])
self.to_screen('Adding thumbnail to "%s"' % filename)
self.run_ffmpeg_multiple_files([filename, thumbnail_filename], temp_filename, options)
except PostProcessingError as err:
self.report_warning('unable to embed using ffprobe & ffmpeg; %s' % error_to_compat_str(err))
if not check_executable('AtomicParsley', ['-v']):
raise EmbedThumbnailPPError('AtomicParsley was not found. Please install.')
cmd = [encodeFilename('AtomicParsley', True),
encodeFilename(filename, True),
encodeArgument('--artwork'),
encodeFilename(thumbnail_filename, True),
encodeArgument('-o'),
encodeFilename(temp_filename, True)]
cmd += [encodeArgument(o) for o in self._configuration_args(exe='AtomicParsley')]
self.to_screen('Adding thumbnail to "%s"' % filename)
self.write_debug('AtomicParsley command line: %s' % shell_quote(cmd))
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = process_communicate_or_kill(p)
if p.returncode != 0:
msg = stderr.decode('utf-8', 'replace').strip()
raise EmbedThumbnailPPError(msg)
# for formats that don't support thumbnails (like 3gp) AtomicParsley
# won't write to the temporary file
if b'No changes' in stdout:
self.report_warning('The file format doesn\'t support embedding a thumbnail')
success = False
elif info['ext'] in ['ogg', 'opus']:
if not _has_mutagen:
raise EmbedThumbnailPPError('module mutagen was not found. Please install using `python -m pip install mutagen`')
self.to_screen('Adding thumbnail to "%s"' % filename)
self.write_debug('AtomicParsley command line: %s' % shell_quote(cmd))
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = process_communicate_or_kill(p)
size_regex = r',\s*(?P<w>\d+)x(?P<h>\d+)\s*[,\[]'
size_result = self.run_ffmpeg(thumbnail_filename, thumbnail_filename, ['-hide_banner'])
mobj = re.search(size_regex, size_result)
width, height = int(mobj.group('w')), int(mobj.group('h'))
mimetype = ('image/%s' % ('png' if thumbnail_ext == 'png' else 'jpeg')).encode('ascii')
if p.returncode != 0:
msg = stderr.decode('utf-8', 'replace').strip()
raise EmbedThumbnailPPError(msg)
# for formats that don't support thumbnails (like 3gp) AtomicParsley
# won't write to the temporary file
if b'No changes' in stdout:
self.report_warning('The file format doesn\'t support embedding a thumbnail')
success = False
# https://xiph.org/flac/format.html#metadata_block_picture
data = bytearray()
data += struct.pack('>II', 3, len(mimetype))
data += mimetype
data += struct.pack('>IIIIII', 0, width, height, 8, 0, os.stat(thumbnail_filename).st_size) # 32 if png else 24
fin = open(thumbnail_filename, "rb")
data += fin.read()
fin.close()
temp_filename = filename
f = mutagen.File(temp_filename)
f.tags['METADATA_BLOCK_PICTURE'] = base64.b64encode(data).decode('ascii')
f.save()
else:
raise EmbedThumbnailPPError('Only mp3, mkv, m4a and mp4 are supported for thumbnail embedding for now.')
raise EmbedThumbnailPPError('Supported filetypes for thumbnail embedding are: mp3, mkv/mka, ogg/opus, m4a/mp4/mov')
if success:
if success and temp_filename != filename:
os.remove(encodeFilename(filename))
os.rename(encodeFilename(temp_filename), encodeFilename(filename))
files_to_delete = [] if self._already_have_thumbnail else [thumbnail_filename]
if self._already_have_thumbnail:
info['__files_to_move'][thumbnail_filename] = replace_extension(
info['__thumbnail_filename'], os.path.splitext(thumbnail_filename)[1][1:])
else:
files_to_delete.append(thumbnail_filename)
return files_to_delete, info
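The ogg/opus branch above assembles a FLAC-style `METADATA_BLOCK_PICTURE` by hand and stores it base64-encoded as a tag. The following is a minimal standalone sketch of just that packing step, assuming the field order used in the diff (picture type 3, MIME type, empty description, width, height, colour depth 8, 0 indexed colours, data length). The helper name `build_picture_block` is hypothetical and the mutagen call is left out.

```python
import base64
import struct


def build_picture_block(mimetype, width, height, image_data):
    """Pack a picture block the way the diff does and return it base64-encoded.

    `mimetype` and `image_data` are bytes. Hypothetical helper, for illustration only.
    """
    data = bytearray()
    # Picture type 3 (front cover) followed by the length of the MIME string
    data += struct.pack('>II', 3, len(mimetype))
    data += mimetype
    # Description length 0, width, height, colour depth 8,
    # number of indexed colours 0, length of the picture data
    data += struct.pack('>IIIIII', 0, width, height, 8, 0, len(image_data))
    data += image_data
    return base64.b64encode(bytes(data)).decode('ascii')


# Three placeholder bytes stand in for a real JPEG file
encoded = build_picture_block(b'image/jpeg', 1280, 720, b'\xff\xd8\xff')
raw = base64.b64decode(encoded)
print(struct.unpack('>II', raw[:8]))
```

All integers are big-endian 32-bit values, which is why every format string starts with `>`.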


@@ -5,6 +5,7 @@ import os
import subprocess
import time
import re
import json
from .common import AudioConversionError, PostProcessor
@@ -20,8 +21,9 @@ from ..utils import (
subtitles_filename,
dfxp2srt,
ISO639Utils,
replace_extension,
process_communicate_or_kill,
replace_extension,
traverse_dict,
)
@@ -59,7 +61,7 @@ class FFmpegPostProcessor(PostProcessor):
def check_version(self):
if not self.available:
raise FFmpegPostProcessorError('ffmpeg or avconv not found. Please install one.')
raise FFmpegPostProcessorError('ffmpeg not found. Please install')
required_version = '10-0' if self.basename == 'avconv' else '1.0'
if is_outdated_version(
@@ -102,7 +104,7 @@ class FFmpegPostProcessor(PostProcessor):
if not os.path.exists(location):
self.report_warning(
'ffmpeg-location %s does not exist! '
'Continuing without avconv/ffmpeg.' % (location))
'Continuing without ffmpeg.' % (location))
self._versions = {}
return
elif not os.path.isdir(location):
@@ -110,7 +112,7 @@ class FFmpegPostProcessor(PostProcessor):
if basename not in programs:
self.report_warning(
'Cannot identify executable %s, its basename should be one of %s. '
'Continuing without avconv/ffmpeg.' %
'Continuing without ffmpeg.' %
(location, ', '.join(programs)))
self._versions = {}
return None
@@ -163,7 +165,7 @@ class FFmpegPostProcessor(PostProcessor):
def get_audio_codec(self, path):
if not self.probe_available and not self.available:
raise PostProcessingError('ffprobe/avprobe and ffmpeg/avconv not found. Please install one.')
raise PostProcessingError('ffprobe and ffmpeg not found. Please install')
try:
if self.probe_available:
cmd = [
@@ -201,6 +203,37 @@ class FFmpegPostProcessor(PostProcessor):
return mobj.group(1)
return None
def get_metadata_object(self, path, opts=[]):
if self.probe_basename != 'ffprobe':
if self.probe_available:
self.report_warning('Only ffprobe is supported for metadata extraction')
raise PostProcessingError('ffprobe not found. Please install.')
self.check_version()
cmd = [
encodeFilename(self.probe_executable, True),
encodeArgument('-hide_banner'),
encodeArgument('-show_format'),
encodeArgument('-show_streams'),
encodeArgument('-print_format'),
encodeArgument('json'),
]
cmd += opts
cmd.append(encodeFilename(self._ffmpeg_filename_argument(path), True))
if self._downloader.params.get('verbose', False):
self._downloader.to_screen('[debug] ffprobe command line: %s' % shell_quote(cmd))
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
stdout, stderr = p.communicate()
return json.loads(stdout.decode('utf-8', 'replace'))
def get_stream_number(self, path, keys, value):
streams = self.get_metadata_object(path)['streams']
num = next(
(i for i, stream in enumerate(streams) if traverse_dict(stream, keys, casesense=False) == value),
None)
return num, len(streams)
def run_ffmpeg_multiple_files(self, input_paths, out_path, opts):
self.check_version()
@@ -227,19 +260,23 @@ class FFmpegPostProcessor(PostProcessor):
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
stdout, stderr = process_communicate_or_kill(p)
if p.returncode != 0:
stderr = stderr.decode('utf-8', 'replace')
msg = stderr.strip().split('\n')[-1]
raise FFmpegPostProcessorError(msg)
stderr = stderr.decode('utf-8', 'replace').strip()
if self._downloader.params.get('verbose', False):
self.report_error(stderr)
raise FFmpegPostProcessorError(stderr.split('\n')[-1])
self.try_utime(out_path, oldest_mtime, oldest_mtime)
return stderr.decode('utf-8', 'replace')
def run_ffmpeg(self, path, out_path, opts):
self.run_ffmpeg_multiple_files([path], out_path, opts)
return self.run_ffmpeg_multiple_files([path], out_path, opts)
def _ffmpeg_filename_argument(self, fn):
# Always use 'file:' because the filename may contain ':' (ffmpeg
# interprets that as a protocol) or can start with '-' (-- is broken in
# ffmpeg, see https://ffmpeg.org/trac/ffmpeg/ticket/2127 for details)
# Also leave '-' intact in order not to break streaming to stdout.
if fn.startswith(('http://', 'https://')):
return fn
return 'file:' + fn if fn != '-' else fn
@@ -349,21 +386,35 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
class FFmpegVideoRemuxerPP(FFmpegPostProcessor):
def __init__(self, downloader=None, preferedformat=None):
super(FFmpegVideoRemuxerPP, self).__init__(downloader)
self._preferedformat = preferedformat
self._preferedformats = preferedformat.lower().split('/')
def run(self, information):
path = information['filepath']
if information['ext'] == self._preferedformat:
self.to_screen('Not remuxing video file %s - already is in target format %s' % (path, self._preferedformat))
sourceext, targetext = information['ext'].lower(), None
for pair in self._preferedformats:
kv = pair.split('>')
if len(kv) == 1 or kv[0].strip() == sourceext:
targetext = kv[-1].strip()
break
_skip_msg = (
'could not find a mapping for %s' if not targetext
else 'already is in target format %s' if sourceext == targetext
else None)
if _skip_msg:
self.to_screen('Not remuxing media file %s; %s' % (path, _skip_msg % sourceext))
return [], information
options = ['-c', 'copy', '-map', '0', '-dn']
prefix, sep, ext = path.rpartition('.')
outpath = prefix + sep + self._preferedformat
self.to_screen('Remuxing video from %s to %s, Destination: ' % (information['ext'], self._preferedformat) + outpath)
if targetext in ['mp4', 'm4a', 'mov']:
options.extend(['-movflags', '+faststart'])
prefix, sep, oldext = path.rpartition('.')
outpath = prefix + sep + targetext
self.to_screen('Remuxing video from %s to %s; Destination: %s' % (sourceext, targetext, outpath))
self.run_ffmpeg(path, outpath, options)
information['filepath'] = outpath
information['format'] = self._preferedformat
information['ext'] = self._preferedformat
information['format'] = targetext
information['ext'] = targetext
return [path], information
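The remuxer hunk above replaces a single target format with a `/`-separated list of rules, each either `source>target` or a bare fallback target. A small sketch of the selection loop on its own, assuming the same first-match-wins semantics; the function name `pick_target_ext` is hypothetical:

```python
def pick_target_ext(source_ext, preferedformat):
    """Return the extension to remux to, or None when no rule applies."""
    source_ext = source_ext.lower()
    for pair in preferedformat.lower().split('/'):
        kv = pair.split('>')
        # A bare entry such as 'mkv' matches any source extension;
        # 'mov>mp4' only matches when the source is 'mov'.
        if len(kv) == 1 or kv[0].strip() == source_ext:
            return kv[-1].strip()
    return None


rules = 'aac>m4a/mov>mp4/mkv'
for ext in ('aac', 'mov', 'webm'):
    print(ext, '->', pick_target_ext(ext, rules))
```

Because the first matching rule wins, a bare fallback only makes sense as the last entry.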
@@ -406,18 +457,22 @@ class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
sub_langs = []
sub_filenames = []
webm_vtt_warn = False
mp4_ass_warn = False
for lang, sub_info in subtitles.items():
sub_ext = sub_info['ext']
if sub_ext == 'json':
self.to_screen('JSON subtitles cannot be embedded')
self.report_warning('JSON subtitles cannot be embedded')
elif ext != 'webm' or ext == 'webm' and sub_ext == 'vtt':
sub_langs.append(lang)
sub_filenames.append(subtitles_filename(filename, lang, sub_ext, ext))
else:
if not webm_vtt_warn and ext == 'webm' and sub_ext != 'vtt':
webm_vtt_warn = True
self.to_screen('Only WebVTT subtitles can be embedded in webm files')
self.report_warning('Only WebVTT subtitles can be embedded in webm files')
if not mp4_ass_warn and ext == 'mp4' and sub_ext == 'ass':
mp4_ass_warn = True
self.report_warning('ASS subtitles cannot be properly embedded in mp4 files; expect issues')
if not sub_langs:
return [], information
@@ -441,7 +496,7 @@ class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
opts.extend(['-metadata:s:s:%d' % i, 'language=%s' % lang_code])
temp_filename = prepend_extension(filename, 'temp')
self.to_screen('Embedding subtitles in \'%s\'' % filename)
self.to_screen('Embedding subtitles in "%s"' % filename)
self.run_ffmpeg_multiple_files(input_files, temp_filename, opts)
os.remove(encodeFilename(filename))
os.rename(encodeFilename(temp_filename), encodeFilename(filename))
@@ -471,7 +526,6 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
# 1. https://kdenlive.org/en/project/adding-meta-data-to-mp4-video/
# 2. https://wiki.multimedia.cx/index.php/FFmpeg_Metadata
# 3. https://kodi.wiki/view/Video_file_tagging
# 4. http://atomicparsley.sourceforge.net/mpeg-4files.html
add('title', ('track', 'title'))
add('date', 'upload_date')
@@ -524,6 +578,18 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
in_filenames.append(metadata_filename)
options.extend(['-map_metadata', '1'])
if '__infojson_filename' in info and info['ext'] in ('mkv', 'mka'):
old_stream, new_stream = self.get_stream_number(
filename, ('tags', 'mimetype'), 'application/json')
if old_stream is not None:
options.extend(['-map', '-0:%d' % old_stream])
new_stream -= 1
options.extend([
'-attach', info['__infojson_filename'],
'-metadata:s:%d' % new_stream, 'mimetype=application/json'
])
self.to_screen('Adding metadata to \'%s\'' % filename)
self.run_ffmpeg_multiple_files(in_filenames, temp_filename, options)
if chapters:


@@ -0,0 +1,71 @@
from __future__ import unicode_literals
import re
from .common import PostProcessor
from ..compat import compat_str
from ..utils import str_or_none
class MetadataFromFieldPP(PostProcessor):
regex = r'(?P<field>\w+):(?P<format>.+)$'
def __init__(self, downloader, formats):
PostProcessor.__init__(self, downloader)
assert isinstance(formats, (list, tuple))
self._data = []
for f in formats:
assert isinstance(f, compat_str)
match = re.match(self.regex, f)
assert match is not None
self._data.append({
'field': match.group('field'),
'format': match.group('format'),
'regex': self.format_to_regex(match.group('format'))})
def format_to_regex(self, fmt):
r"""
Converts a string like
'%(title)s - %(artist)s'
to a regex like
'(?P<title>[^\r\n]+)\ \-\ (?P<artist>[^\r\n]+)'
"""
if not re.search(r'%\(\w+\)s', fmt):
return fmt
lastpos = 0
regex = ''
# replace %(..)s with regex group and escape other string parts
for match in re.finditer(r'%\((\w+)\)s', fmt):
regex += re.escape(fmt[lastpos:match.start()])
regex += r'(?P<' + match.group(1) + r'>[^\r\n]+)'
lastpos = match.end()
if lastpos < len(fmt):
regex += re.escape(fmt[lastpos:])
return regex
def run(self, info):
for dictn in self._data:
field, regex = dictn['field'], dictn['regex']
if field not in info:
self.report_warning('Video does not have a %s' % field)
continue
data_to_parse = str_or_none(info[field])
if data_to_parse is None:
self.report_warning('Field %s cannot be parsed' % field)
continue
self.write_debug('Searching for r"%s" in %s' % (regex, field))
match = re.search(regex, data_to_parse)
if match is None:
self.report_warning('Could not interpret video %s as "%s"' % (field, dictn['format']))
continue
for attribute, value in match.groupdict().items():
info[attribute] = value
self.to_screen('parsed %s from %s: %s' % (attribute, field, value if value is not None else 'NA'))
return [], info
class MetadataFromTitlePP(MetadataFromFieldPP): # for backward compatibility
def __init__(self, downloader, titleformat):
super(MetadataFromTitlePP, self).__init__(downloader, ['title:%s' % titleformat])
self._titleformat = titleformat
self._titleregex = self._data[0]['regex']
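As a quick illustration of `format_to_regex` from the file above, here is a standalone copy of the method as a plain function, applied to a made-up title. The sample strings are assumptions chosen for the example.

```python
import re


def format_to_regex(fmt):
    """Turn '%(name)s' placeholders into named groups and escape the rest."""
    if not re.search(r'%\(\w+\)s', fmt):
        return fmt  # no placeholders: the input is used as a regex as-is
    lastpos = 0
    regex = ''
    for match in re.finditer(r'%\((\w+)\)s', fmt):
        # Literal text between placeholders is escaped
        regex += re.escape(fmt[lastpos:match.start()])
        # Each placeholder becomes a named group that stops at line breaks
        regex += r'(?P<' + match.group(1) + r'>[^\r\n]+)'
        lastpos = match.end()
    if lastpos < len(fmt):
        regex += re.escape(fmt[lastpos:])
    return regex


pattern = format_to_regex('%(artist)s - %(title)s')
fields = re.search(pattern, 'Some Artist - Some Song').groupdict()
print(fields)
```

Since the groups are greedy, a title containing the separator more than once is split at its last occurrence.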


@@ -1,44 +0,0 @@
from __future__ import unicode_literals
import re
from .common import PostProcessor
class MetadataFromTitlePP(PostProcessor):
def __init__(self, downloader, titleformat):
super(MetadataFromTitlePP, self).__init__(downloader)
self._titleformat = titleformat
self._titleregex = (self.format_to_regex(titleformat)
if re.search(r'%\(\w+\)s', titleformat)
else titleformat)
def format_to_regex(self, fmt):
r"""
Converts a string like
'%(title)s - %(artist)s'
to a regex like
'(?P<title>.+)\ \-\ (?P<artist>.+)'
"""
lastpos = 0
regex = ''
# replace %(..)s with regex group and escape other string parts
for match in re.finditer(r'%\((\w+)\)s', fmt):
regex += re.escape(fmt[lastpos:match.start()])
regex += r'(?P<' + match.group(1) + '>.+)'
lastpos = match.end()
if lastpos < len(fmt):
regex += re.escape(fmt[lastpos:])
return regex
def run(self, info):
title = info['title']
match = re.match(self._titleregex, title)
if match is None:
self.to_screen('Could not interpret title of video as "%s"' % self._titleformat)
return [], info
for attribute, value in match.groupdict().items():
info[attribute] = value
self.to_screen('parsed %s: %s' % (attribute, value if value is not None else 'NA'))
return [], info


@@ -0,0 +1,54 @@
from __future__ import unicode_literals
import os
import shutil
from .common import PostProcessor
from ..utils import (
encodeFilename,
make_dir,
PostProcessingError,
)
from ..compat import compat_str
class MoveFilesAfterDownloadPP(PostProcessor):
def __init__(self, downloader, files_to_move):
PostProcessor.__init__(self, downloader)
self.files_to_move = files_to_move
@classmethod
def pp_key(cls):
return 'MoveFiles'
def run(self, info):
dl_path, dl_name = os.path.split(encodeFilename(info['filepath']))
finaldir = info.get('__finaldir', dl_path)
finalpath = os.path.join(finaldir, dl_name)
self.files_to_move.update(info['__files_to_move'])
self.files_to_move[info['filepath']] = finalpath
for oldfile, newfile in self.files_to_move.items():
if not newfile:
newfile = os.path.join(finaldir, os.path.basename(encodeFilename(oldfile)))
oldfile, newfile = compat_str(oldfile), compat_str(newfile)
if os.path.abspath(encodeFilename(oldfile)) == os.path.abspath(encodeFilename(newfile)):
continue
if not os.path.exists(encodeFilename(oldfile)):
self.report_warning('File "%s" cannot be found' % oldfile)
continue
if os.path.exists(encodeFilename(newfile)):
if self.get_param('overwrites', True):
self.report_warning('Replacing existing file "%s"' % newfile)
os.remove(encodeFilename(newfile))
else:
self.report_warning(
'Cannot move file "%s" out of temporary directory since "%s" already exists'
% (oldfile, newfile))
continue
make_dir(newfile, PostProcessingError)
self.to_screen('Moving file "%s" to "%s"' % (oldfile, newfile))
shutil.move(oldfile, newfile) # os.rename cannot move between volumes
info['filepath'] = compat_str(finalpath)
return [], info


@@ -84,6 +84,7 @@ class SponSkrubPP(PostProcessor):
else:
msg = stderr.decode('utf-8', 'replace').strip() or stdout.decode('utf-8', 'replace').strip()
self.write_debug(msg, prefix=False)
msg = msg.split('\n')[-1]
line = 0 if msg[:12].lower() == 'unrecognised' else -1
msg = msg.split('\n')[line]
raise PostProcessingError(msg if msg else 'sponskrub failed with error code %s' % p.returncode)
return [], information


@@ -16,6 +16,7 @@ import email.header
import errno
import functools
import gzip
import imp
import io
import itertools
import json
@@ -49,6 +50,7 @@ from .compat import (
compat_html_entities_html5,
compat_http_client,
compat_integer_types,
compat_numeric_types,
compat_kwargs,
compat_os_name,
compat_parse_qs,
@@ -3672,6 +3674,18 @@ def url_or_none(url):
return url if re.match(r'^(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//', url) else None
def strftime_or_none(timestamp, date_format, default=None):
datetime_object = None
try:
if isinstance(timestamp, compat_numeric_types): # unix timestamp
datetime_object = datetime.datetime.utcfromtimestamp(timestamp)
elif isinstance(timestamp, compat_str): # assume YYYYMMDD
datetime_object = datetime.datetime.strptime(timestamp, '%Y%m%d')
return datetime_object.strftime(date_format)
except (ValueError, TypeError, AttributeError):
return default
def parse_duration(s):
if not isinstance(s, compat_basestring):
return None
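The `strftime_or_none` helper added in the hunk above accepts either a unix timestamp or a `YYYYMMDD` string and falls back to `default` on any failure. A standalone sketch of the same behaviour, assuming Python 3 only: `compat_numeric_types` and `compat_str` are replaced by `(int, float)` and `str`, and a timezone-aware call is used in place of `utcfromtimestamp`.

```python
import datetime


def strftime_or_none(timestamp, date_format, default=None):
    datetime_object = None
    try:
        if isinstance(timestamp, (int, float)):  # unix timestamp, read as UTC
            datetime_object = datetime.datetime.fromtimestamp(
                timestamp, datetime.timezone.utc)
        elif isinstance(timestamp, str):  # assume YYYYMMDD
            datetime_object = datetime.datetime.strptime(timestamp, '%Y%m%d')
        # For any other input datetime_object is still None, so the
        # attribute lookup fails and the default is returned.
        return datetime_object.strftime(date_format)
    except (ValueError, TypeError, AttributeError):
        return default


print(strftime_or_none('20210204', '%d/%m/%Y'))
print(strftime_or_none(0, '%Y-%m-%d'))
print(strftime_or_none('not a date', '%Y', default='NA'))
```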
@@ -4155,7 +4169,18 @@ def qualities(quality_ids):
return q
DEFAULT_OUTTMPL = '%(title)s [%(id)s].%(ext)s'
DEFAULT_OUTTMPL = {
'default': '%(title)s [%(id)s].%(ext)s',
}
OUTTMPL_TYPES = {
'subtitle': None,
'thumbnail': None,
'description': 'description',
'annotation': 'annotations.xml',
'infojson': 'info.json',
'pl_description': 'description',
'pl_infojson': 'info.json',
}
def limit_length(s, length):
@@ -4656,12 +4681,35 @@ def cli_valueless_option(params, command_option, param, expected_value=True):
return [command_option] if param == expected_value else []
def cli_configuration_args(params, param, default=[]):
ex_args = params.get(param)
if ex_args is None:
return default
assert isinstance(ex_args, list)
return ex_args
def cli_configuration_args(params, arg_name, key, default=[], exe=None): # returns arg, for_compat
argdict = params.get(arg_name, {})
if isinstance(argdict, (list, tuple)): # for backward compatibility
return argdict, True
if argdict is None:
return default, False
assert isinstance(argdict, dict)
assert isinstance(key, compat_str)
key = key.lower()
args = exe_args = None
if exe is not None:
assert isinstance(exe, compat_str)
exe = exe.lower()
args = argdict.get('%s+%s' % (key, exe))
if args is None:
exe_args = argdict.get(exe)
if args is None:
args = argdict.get(key) if key != exe else None
if args is None and exe_args is None:
args = argdict.get('default', default)
args, exe_args = args or [], exe_args or []
assert isinstance(args, (list, tuple))
assert isinstance(exe_args, (list, tuple))
return args + exe_args, False
class ISO639Utils(object):
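The lookup order in the new `cli_configuration_args` is easier to follow with concrete inputs. The sketch below is a standalone copy without the assertions. The page has lost its indentation, so the nesting here is reconstructed and should be read as an assumption, and the argument values are invented.

```python
def cli_configuration_args(params, arg_name, key, default=[], exe=None):
    """Return (args, for_compat). Nesting reconstructed from the flattened diff."""
    argdict = params.get(arg_name, {})
    if isinstance(argdict, (list, tuple)):  # legacy flat list
        return argdict, True
    if argdict is None:
        return default, False
    key = key.lower()
    args = exe_args = None
    if exe is not None:
        exe = exe.lower()
        args = argdict.get('%s+%s' % (key, exe))  # most specific entry first
        if args is None:
            exe_args = argdict.get(exe)
    if args is None:
        args = argdict.get(key) if key != exe else None
    if args is None and exe_args is None:
        args = argdict.get('default', default)
    args, exe_args = args or [], exe_args or []
    return args + exe_args, False


params = {'postprocessor_args': {
    'embedthumbnail+atomicparsley': ['-a'],
    'atomicparsley': ['-b'],
    'default': ['-d'],
}}
print(cli_configuration_args(params, 'postprocessor_args', 'EmbedThumbnail', exe='AtomicParsley'))
print(cli_configuration_args(params, 'postprocessor_args', 'Other', exe='AtomicParsley'))
print(cli_configuration_args(params, 'postprocessor_args', 'Other'))
```

Under this reading, a `key+exe` entry takes precedence over everything else, and `default` is consulted only when neither a key-specific nor an executable-specific entry exists.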
@@ -5863,3 +5911,61 @@ def clean_podcast_url(url):
st\.fm # https://podsights.com/docs/
)/e
)/''', '', url)
_HEX_TABLE = '0123456789abcdef'
def random_uuidv4():
return re.sub(r'[xy]', lambda x: _HEX_TABLE[random.randint(0, 15)], 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx')
def make_dir(path, to_screen=None):
try:
dn = os.path.dirname(path)
if dn and not os.path.exists(dn):
os.makedirs(dn)
return True
except (OSError, IOError) as err:
if callable(to_screen):
to_screen('unable to create directory ' + error_to_compat_str(err))
return False
def get_executable_path():
path = os.path.dirname(sys.argv[0])
if os.path.abspath(sys.argv[0]) != os.path.abspath(sys.executable): # Not packaged
path = os.path.join(path, '..')
return os.path.abspath(path)
def load_plugins(name, type, namespace):
plugin_info = [None]
classes = []
try:
plugin_info = imp.find_module(
name, [os.path.join(get_executable_path(), 'ytdlp_plugins')])
plugins = imp.load_module(name, *plugin_info)
for name in dir(plugins):
if not name.endswith(type):
continue
klass = getattr(plugins, name)
classes.append(klass)
namespace[name] = klass
except ImportError:
pass
finally:
if plugin_info[0] is not None:
plugin_info[0].close()
return classes
def traverse_dict(dictn, keys, casesense=True):
if not isinstance(dictn, dict):
return None
first_key = keys[0]
if not casesense:
dictn = {key.lower(): val for key, val in dictn.items()}
first_key = first_key.lower()
value = dictn.get(first_key, None)
return value if len(keys) < 2 else traverse_dict(value, keys[1:], casesense)
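`traverse_dict` is what lets `get_stream_number` look up nested, case-insensitive keys in ffprobe's JSON. The sketch below copies the function and runs the same `next(...)` search over a hypothetical, heavily trimmed stream list. The sample data is invented for illustration and is not real ffprobe output.

```python
def traverse_dict(dictn, keys, casesense=True):
    if not isinstance(dictn, dict):
        return None
    first_key = keys[0]
    if not casesense:
        # Lower-case this level's keys so the lookup ignores case
        dictn = {key.lower(): val for key, val in dictn.items()}
        first_key = first_key.lower()
    value = dictn.get(first_key, None)
    return value if len(keys) < 2 else traverse_dict(value, keys[1:], casesense)


# Hypothetical stream list; only the fields needed for the lookup are present
streams = [
    {'codec_type': 'video', 'disposition': {'attached_pic': 0}},
    {'codec_type': 'audio', 'disposition': {'attached_pic': 0}},
    {'codec_type': 'video', 'disposition': {'attached_pic': 1}},
]
cover_index = next(
    (i for i, stream in enumerate(streams)
     if traverse_dict(stream, ('disposition', 'attached_pic'), casesense=False) == 1),
    None)
print(cover_index, len(streams))
```

A missing key or a non-dict value anywhere along the path yields `None` rather than raising, so streams without the requested field are simply skipped.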


@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2021.01.16'
__version__ = '2021.01.29'


@@ -0,0 +1,2 @@
# flake8: noqa
from .sample import SamplePluginIE


@@ -0,0 +1,12 @@
from __future__ import unicode_literals
from youtube_dlc.extractor.common import InfoExtractor
class SamplePluginIE(InfoExtractor):
_WORKING = False
IE_DESC = False
_VALID_URL = r'^sampleplugin:'
def _real_extract(self, url):
self.to_screen('URL "%s" successfully captured' % url)