1
0
mirror of https://github.com/yt-dlp/yt-dlp.git synced 2026-01-12 01:41:26 +00:00

Compare commits

...

85 Commits

Author SHA1 Message Date
github-actions[bot]
575753b9f3 Release 2025.08.20
Created by: bashonly

:ci skip all
2025-08-20 02:51:33 +00:00
bashonly
c2fc4f3e7f [cleanup] Misc (#13991)
Authored by: bashonly
2025-08-20 02:42:34 +00:00
bashonly
07247d6c20 [build] Add Windows ARM64 builds (#14003)
Adds yt-dlp_arm64.exe and yt-dlp_win_arm64.zip to release assets

Closes #13849
Authored by: bashonly
2025-08-20 02:28:00 +00:00
bashonly
f63a7e41d1 [ie/youtube] Add playback_wait extractor-arg
Authored by: bashonly
2025-08-19 21:22:00 -05:00
bashonly
7b8a8abb98 [ie/francetv:site] Fix extractor (#14082)
Closes #14072
Authored by: bashonly
2025-08-20 01:35:32 +00:00
bashonly
a97f4cb57e [ie/youtube] Handle required preroll waiting period (#14081)
Authored by: bashonly
2025-08-19 20:06:53 -05:00
bashonly
d154dc3dcf [ie/youtube] Remove default player params (#14081)
Closes #13930
Authored by: bashonly
2025-08-19 20:06:53 -05:00
bashonly
438d3f06b3 [fd] Support available_at format field (#13980)
Authored by: bashonly
2025-08-20 00:38:48 +00:00
CasperMcFadden95
74b4b3b005 [ie/faulio] Add extractor (#13907)
Authored by: CasperMcFadden95
2025-08-19 23:27:31 +00:00
e2dk4r
36e873822b [ie/puhutv] Fix playlists extraction (#11955)
Closes #5691
Authored by: e2dk4r
2025-08-19 23:20:07 +00:00
Arseniy D.
d3d1ac8eb2 [ie/steam] Fix extractor (#14008)
Closes #14000
Authored by: AzartX47
2025-08-19 23:14:20 +00:00
doe1080
86d74e5cf0 [ie/medialaan] Rework extractors (#14015)
Fixes MedialaanIE and VTMIE

Authored by: doe1080
2025-08-19 23:10:17 +00:00
Junyi Lou
6ca9165648 [ie/bilibili] Handle Bangumi redirection (#14038)
Closes #13924
Authored by: junyilou, grqz

Co-authored-by: _Grqz <173015200+grqz@users.noreply.github.com>
2025-08-19 23:06:49 +00:00
Pierre
82a1390204 [ie/svt] Extract forced subs under separate lang code (#14062)
Closes #14020
Authored by: PierreMesure
2025-08-19 22:57:46 +00:00
Runar Saur Modahl
7540aa1da1 [ie/NRKTVEpisode] Fix extractor (#14065)
Closes #14054
Authored by: runarmod
2025-08-19 22:45:55 +00:00
bashonly
35da8df4f8 [utils] Add improved jwt_encode function (#14071)
Also deprecates `jwt_encode_hs256`

Authored by: bashonly
2025-08-19 22:36:00 +00:00
bashonly
8df121ba59 [ie/mtv] Overhaul extractors (#14052)
Adds SouthParkComBrIE and SouthParkCoUkIE

Removes these extractors:
 - CMTIE: migrated to Paramount+
 - ComedyCentralTVIE: migrated to Paramount+
 - MTVDEIE: migrated to Paramount+
 - MTVItaliaIE: migrated to Paramount+
 - MTVItaliaProgrammaIE: migrated to Paramount+
 - MTVJapanIE: migrated to JP Services
 - MTVServicesEmbeddedIE: dead domain
 - MTVVideoIE: migrated to Paramount+
 - NickBrIE: redirects to landing page w/o any videos
 - NickDeIE: redirects to landing page w/o any videos
 - NickRuIE: redirects to landing page w/o any videos
 - BellatorIE: migrated to PFL
 - ParamountNetworkIE: migrated to Paramount+
 - SouthParkNlIE: site no longer exists
 - TVLandIE: migrated to Paramount+

Closes #169, Closes #1711, Closes #1712, Closes #2621, Closes #3167, Closes #3893, Closes #4552, Closes #4702, Closes #4928, Closes #5249, Closes #6156, Closes #8722, Closes #9896, Closes #10168, Closes #12765, Closes #13446, Closes #14009

Authored by: bashonly, doe1080, Randalix, seproDev

Co-authored-by: doe1080 <98906116+doe1080@users.noreply.github.com>
Co-authored-by: Randalix <23729538+Randalix@users.noreply.github.com>
Co-authored-by: sepro <sepro@sepr0.com>
2025-08-19 20:46:11 +00:00
bashonly
471a2b60e0 [ie/tiktok:user] Improve infinite loop prevention (#14077)
Fix edf55e8184

Closes #14076
Authored by: bashonly
2025-08-19 20:39:33 +00:00
bashonly
df0553153e [ie/youtube] Default to main player JS variant (#14079)
Authored by: bashonly
2025-08-19 19:28:15 +00:00
bashonly
7bc53ae799 [ie/youtube] Extract title and description from initial data (#14078)
Closes #13604
Authored by: bashonly
2025-08-19 19:27:17 +00:00
bashonly
d8200ff0a4 [ie/vimeo:album] Support embed-only and non-numeric albums (#14021)
Authored by: bashonly
2025-08-18 22:09:35 +00:00
bashonly
0f6b915822 [ie/vimeo:event] Fix extractor (#14064)
Closes #14059
Authored by: bashonly
2025-08-18 21:09:25 +00:00
doe1080
374ea049f5 [ie/niconico:live] Support age-restricted streams (#13549)
Authored by: doe1080
2025-08-18 17:43:40 +00:00
doe1080
6f4c1bb593 [cleanup] Remove dead extractors (#13996)
Removes ArkenaIE, PladformIE, VevoIE, VevoPlaylistIE

Authored by: doe1080
2025-08-18 16:34:32 +00:00
doe1080
c22660aed5 [ie/adobetv] Fix extractor (#13917)
Removes AdobeTVChannelIE, AdobeTVEmbedIE, AdobeTVIE, AdobeTVShowIE

Authored by: doe1080
2025-08-18 16:32:26 +00:00
bashonly
404bd889d0 [ie/weibo] Support more URLs and --no-playlist (#14035)
Authored by: bashonly
2025-08-16 23:02:04 +00:00
bashonly
edf55e8184 [ie/tiktok:user] Avoid infinite loop during extraction (#14032)
Closes #14031
Authored by: bashonly
2025-08-16 22:57:14 +00:00
bashonly
8a8861d538 [ie/youtube:tab] Fix playlists tab extraction (#14030)
Closes #14028
Authored by: bashonly
2025-08-16 22:55:21 +00:00
sepro
70f5669951 Warn against using -f mp4 (#13915)
Authored by: seproDev
2025-08-17 00:35:46 +02:00
doe1080
6ae3543d5a [ie] _rta_search: Do not assume age_limit is 0 (#13985)
Authored by: doe1080
2025-08-16 04:28:58 +00:00
doe1080
770119bdd1 [ie] Extract avif storyboard formats from MPD manifests (#14016)
Authored by: doe1080
2025-08-16 03:32:21 +00:00
Arseniy D.
8e3f8065af [ie/weibo] Fix extractors (#14012)
Closes #14012
Authored by: AzartX47, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-08-16 03:07:35 +00:00
bashonly
aea85d525e [build] Discontinue darwin_legacy_exe support (#13860)
* Removes "yt-dlp_macos_legacy" from release assets
* Discontinues executable support for macOS < 10.15

Closes #13856
Authored by: bashonly
2025-08-13 22:02:58 +00:00
bashonly
f2919bd28e [ie/youtube] Add es5 and es6 player JS variants (#14005)
Authored by: bashonly
2025-08-12 23:24:31 +00:00
bashonly
681ed2153d [build] Bump PyInstaller version to 6.15.0 for Windows (#14002)
Authored by: bashonly
2025-08-12 23:17:13 +00:00
bashonly
bdeb3eb3f2 [pp/XAttrMetadata] Only set "Where From" attribute on macOS (#13999)
Fix 3e918d825d

Closes #14004
Authored by: bashonly
2025-08-12 07:58:22 +00:00
github-actions[bot]
b7de89c910 Release 2025.08.11
Created by: bashonly

:ci skip all
2025-08-11 03:54:46 +00:00
sepro
5e4ceb35cf [cleanup] Misc (#13852)
Closes #13815
Authored by: seproDev, injust, bashonly

Co-authored-by: Justin Su <injustsu@gmail.com>
Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-08-11 03:18:28 +00:00
bashonly
e98695549e [rh:curl_cffi] Support curl_cffi 0.11.x, 0.12.x, 0.13.x (#13989)
Authored by: bashonly
2025-08-11 03:16:07 +00:00
bashonly
bf366517ef [ie/youtube] Update player params (#13979)
Closes #13930
Authored by: bashonly
2025-08-10 07:33:45 +00:00
bashonly
c76ce28e06 Deprecate linux_armv7l_exe support (#13978)
Ref: https://github.com/yt-dlp/yt-dlp/issues/13976

Authored by: bashonly
2025-08-10 06:53:10 +00:00
Simon Sawicki
e8d49b1c7f [ie/motherless] Fix extractor (#13960)
Authored by: Grub4K
2025-08-07 21:04:30 -07:00
Sojiroh
a6df5e8a58 [ie/YandexDisk] Support 360 URLs (#13935)
Closes #13887
Authored by: Sojiroh
2025-08-07 23:16:55 +02:00
bashonly
e8d2807296 [ie/digitalconcerthall] Fix formats extraction (#13948)
Closes #13925
Authored by: bashonly
2025-08-07 00:03:44 +00:00
bashonly
fe53ebe5b6 [fd/dash] Re-extract if using --load-info-json with --live-from-start (#13922)
Closes #13906
Authored by: bashonly
2025-08-06 20:08:34 +00:00
sepro
662af5bb83 Warn when yt-dlp is severely outdated (#13937)
Authored by: seproDev
2025-08-06 21:14:45 +02:00
bashonly
8175f3738f [rh:requests] Bump minimum required version of urllib3 to 2.0.2 (#13939)
- urllib3 1.26.x gives unexpected results with partial reads: https://github.com/urllib3/urllib3/issues/2128
- urllib3 2.0.0 and 2.0.1 were yanked from PyPI: https://github.com/urllib3/urllib3/issues/3009

Closes #13927
Authored by: bashonly
2025-08-06 19:00:53 +00:00
sepro
1e0c77ddcc [pp/XAttrMetadata] Don't write "Where from" on Windows (#13944)
Fix 3e918d825d

Closes #13942
Authored by: seproDev
2025-08-06 16:52:34 +02:00
sepro
e651a53a2f Revert f799a4b472 2025-08-05 22:02:13 +02:00
sepro
f799a4b472 [ie/youtube] Update tv client config (#13934)
Closes #13930
Authored by: seproDev
2025-08-05 18:47:37 +02:00
coletdjnz
38c2bf4026 [ie/youtube] Add player params to mweb client (#13914)
Authored by: coletdjnz
2025-08-03 13:07:06 +12:00
Iuri Campos
6ff135c319 [ie/shiey] Add extractor (#13354)
Closes #12129
Authored by: iribeirocampos
2025-08-03 00:05:40 +02:00
JChris246
cd31c319e3 [ie/fc2] Fix old video support (#12633)
Closes #11778
Authored by: JChris246, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-08-02 23:37:35 +02:00
u-spec-png
6539ee1947 [ie/N1Info:article] Fix extractor (#13865)
Authored by: u-spec-png
2025-08-02 20:10:40 +00:00
CasperMcFadden95
43dedbe639 [ie/RoyaLive] Support en URLs (#13908)
Authored by: CasperMcFadden95
2025-08-02 19:59:30 +00:00
doe1080
05e553e9d1 [ie/niconico] Fix error handling & improve metadata extraction (#13240)
Closes #13338
Authored by: doe1080
2025-08-02 19:55:08 +00:00
doe1080
1c6068af99 [cleanup] Move embed tests to dedicated extractors (#13782)
Authored by: doe1080
2025-08-01 20:50:20 +00:00
garret1317
71f30921a2 [ie/tbsjp] Fix extractor (#13485)
Closes #13484
Authored by: garret1317
2025-07-31 20:33:05 +00:00
Abdulmohsen
121647705a [ie/TVer] Support --ignore-no-formats-error when geo-blocked (#13598)
Authored by: arabcoders
2025-07-30 23:23:06 +00:00
bashonly
70d7687487 [ie/TVer] Extract Streaks API info (#13885)
Closes #13874
Authored by: bashonly
2025-07-30 23:15:59 +00:00
bashonly
42ca3d601e [ie/archive.org] Fix metadata extraction (#13880)
Closes #13881
Authored by: bashonly
2025-07-30 06:11:09 +00:00
bashonly
62e2a9c0d5 [ci] Bump supported PyPy version to 3.11 (#13877)
Ref: https://pypy.org/posts/2025/07/pypy-v7320-release.html

Authored by: bashonly
2025-07-29 21:31:35 +00:00
bashonly
28b68f6875 [cookies] Load cookies with float expires timestamps (#13873)
Authored by: bashonly
2025-07-29 19:47:28 +00:00
fries1234
682334e4b3 [ie/tvw:news] Add extractor (#12907)
Authored by: fries1234
2025-07-27 22:26:33 +02:00
Florentin Le Moal
b831406a1d [ie/rtve.es:program] Add extractor
Authored by: meGAmeS1, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-07-27 21:52:05 +02:00
bashonly
23c658b9cb Raise minimum recommended Python version to 3.10 (#13859)
Ref: https://github.com/yt-dlp/yt-dlp/issues/13858

Authored by: bashonly
2025-07-26 22:59:02 +00:00
bashonly
cc5a5caac5 Deprecate darwin_legacy_exe support (#13857)
Ref: https://github.com/yt-dlp/yt-dlp/issues/13856

Authored by: bashonly
2025-07-26 22:12:53 +00:00
bashonly
66aa21dc5a [build] Use macos-14 runner for macos builds (#13814)
Ref: https://github.blog/changelog/2025-07-11-upcoming-changes-to-macos-hosted-runners-macos-latest-migration-and-xcode-support-policy-updates/#macos-13-is-closing-down

Authored by: bashonly
2025-07-26 19:39:54 +00:00
Tom Hebb
57186f958f [fd/hls] Fix --hls-split-continuity support (#13321)
Authored by: tchebb
2025-07-26 20:43:38 +02:00
CasperMcFadden95
daa1859be1 [ie/FaulioLive] Support Bahry TV (#13850)
Authored by: CasperMcFadden95
2025-07-26 20:11:57 +02:00
c-basalt
e8c2bf798b [ie/neteasemusic] Support XFF (#11044)
Closes #11043
Authored by: c-basalt
2025-07-26 20:02:56 +02:00
doe1080
1fe83b0111 [ie/eagleplatform] Remove extractors (#13469)
Authored by: doe1080
2025-07-26 17:34:22 +02:00
InvalidUsernameException
30302df22b [ie/sportdeuschland] Support embedded player URLs (#13833)
Closes #13766
Authored by: InvalidUsernameException
2025-07-25 22:22:32 +00:00
CasperMcFadden95
3e609b2ced [ie/FaulioLive] Add extractor (#13421)
Authored by: CasperMcFadden95, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-07-25 23:33:49 +02:00
bashonly
d399505fdf [fd/external] Work around ffmpeg's file: URL handling (#13844)
Closes #13781
Authored by: bashonly
2025-07-25 19:44:39 +00:00
sepro
61d4cd0bc0 [ie/PlyrEmbed] Add extractor (#13836)
Closes #13827
Authored by: seproDev
2025-07-25 20:55:41 +02:00
doe1080
4385480795 [utils] parse_resolution: Support width-only pattern (#13802)
Authored by: doe1080
2025-07-25 20:41:21 +02:00
Barry van Oudtshoorn
485de69dbf [ie/Parlview] Rework extractor (#13788)
Closes #13787
Authored by: barryvan
2025-07-25 06:00:31 +02:00
ischmidt20
0adeb1e54b [ie/tbs] Fix truTV support (#9683)
Closes #3400
Authored by: ischmidt20, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-07-24 22:35:48 +00:00
bashonly
afaf60d9fd [ie/vimeo] Fix login support and require authentication (#13823)
Closes #13822
Authored by: bashonly
2025-07-23 23:27:20 +00:00
Atsushi2965
7e3f48d64d [pp/EmbedThumbnail] Fix ffmpeg args for embedding in mp3 (#13720)
Authored by: atsushi2965
2025-07-22 21:55:00 +00:00
bashonly
59765ecbc0 [ie/sproutvideo] Fix extractor (#13813)
Authored by: bashonly
2025-07-22 21:46:46 +00:00
bashonly
c59ad2b066 [utils] random_user_agent: Bump versions (#13543)
Closes #5362
Authored by: bashonly
2025-07-22 21:34:03 +00:00
Simon Sawicki
eed94c7306 [utils] Add WINDOWS_VT_MODE to globals (#12460)
Authored by: Grub4K
2025-07-22 20:10:51 +02:00
Roland Crosby
3e918d825d [pp/XAttrMetadata] Add macOS "Where from" attribute (#12664)
Authored by: rolandcrosby
2025-07-22 19:50:42 +02:00
159 changed files with 8172 additions and 9370 deletions

View File

@@ -21,15 +21,9 @@ on:
macos:
default: true
type: boolean
macos_legacy:
default: true
type: boolean
windows:
default: true
type: boolean
windows32:
default: true
type: boolean
origin:
required: false
default: ''
@@ -67,16 +61,8 @@ on:
description: yt-dlp_macos, yt-dlp_macos.zip
default: true
type: boolean
macos_legacy:
description: yt-dlp_macos_legacy
default: true
type: boolean
windows:
description: yt-dlp.exe, yt-dlp_win.zip
default: true
type: boolean
windows32:
description: yt-dlp_x86.exe
description: yt-dlp.exe, yt-dlp_win.zip, yt-dlp_x86.exe, yt-dlp_win_x86.zip, yt-dlp_arm64.exe, yt-dlp_win_arm64.zip
default: true
type: boolean
origin:
@@ -208,7 +194,7 @@ jobs:
python3.9 -m pip install -U pip wheel 'setuptools>=71.0.2'
# XXX: Keep this in sync with pyproject.toml (it can't be accessed at this stage) and exclude secretstorage
python3.9 -m pip install -U Pyinstaller mutagen pycryptodomex brotli certifi cffi \
'requests>=2.32.2,<3' 'urllib3>=1.26.17,<3' 'websockets>=13.0'
'requests>=2.32.2,<3' 'urllib3>=2.0.2,<3' 'websockets>=13.0'
run: |
cd repo
@@ -242,7 +228,7 @@ jobs:
permissions:
contents: read
actions: write # For cleaning up cache
runs-on: macos-13
runs-on: macos-14
steps:
- uses: actions/checkout@v4
@@ -261,6 +247,8 @@ jobs:
- name: Install Requirements
run: |
brew install coreutils
# We need to use system Python in order to roll our own universal2 curl_cffi wheel
brew uninstall --ignore-dependencies python3
python3 -m venv ~/yt-dlp-build-venv
source ~/yt-dlp-build-venv/bin/activate
python3 devscripts/install_deps.py -o --include build
@@ -342,73 +330,56 @@ jobs:
~/yt-dlp-build-venv
key: cache-reqs-${{ github.job }}-${{ github.ref }}
macos_legacy:
needs: process
if: inputs.macos_legacy
runs-on: macos-13
steps:
- uses: actions/checkout@v4
- name: Install Python
# We need the official Python, because the GA ones only support newer macOS versions
env:
PYTHON_VERSION: 3.10.5
MACOSX_DEPLOYMENT_TARGET: 10.9 # Used up by the Python build tools
run: |
# Hack to get the latest patch version. Uncomment if needed
#brew install python@3.10
#export PYTHON_VERSION=$( $(brew --prefix)/opt/python@3.10/bin/python3 --version | cut -d ' ' -f 2 )
curl "https://www.python.org/ftp/python/${PYTHON_VERSION}/python-${PYTHON_VERSION}-macos11.pkg" -o "python.pkg"
sudo installer -pkg python.pkg -target /
python3 --version
- name: Install Requirements
run: |
brew install coreutils
python3 devscripts/install_deps.py --user -o --include build
python3 devscripts/install_deps.py --user --include pyinstaller
- name: Prepare
run: |
python3 devscripts/update-version.py -c "${{ inputs.channel }}" -r "${{ needs.process.outputs.origin }}" "${{ inputs.version }}"
python3 devscripts/make_lazy_extractors.py
- name: Build
run: |
python3 -m bundle.pyinstaller
mv dist/yt-dlp_macos dist/yt-dlp_macos_legacy
- name: Verify --update-to
if: vars.UPDATE_TO_VERIFICATION
run: |
chmod +x ./dist/yt-dlp_macos_legacy
cp ./dist/yt-dlp_macos_legacy ./dist/yt-dlp_macos_legacy_downgraded
version="$(./dist/yt-dlp_macos_legacy --version)"
./dist/yt-dlp_macos_legacy_downgraded -v --update-to yt-dlp/yt-dlp@2023.03.04
downgraded_version="$(./dist/yt-dlp_macos_legacy_downgraded --version)"
[[ "$version" != "$downgraded_version" ]]
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: build-bin-${{ github.job }}
path: |
dist/yt-dlp_macos_legacy
compression-level: 0
windows:
needs: process
if: inputs.windows
runs-on: windows-latest
permissions:
contents: read
actions: write # For cleaning up cache
runs-on: ${{ matrix.runner }}
strategy:
fail-fast: false
matrix:
include:
- arch: 'x64'
runner: windows-2025
suffix: ''
python_version: '3.10'
- arch: 'x86'
runner: windows-2025
suffix: '_x86'
python_version: '3.10'
- arch: 'arm64'
runner: windows-11-arm
suffix: '_arm64'
python_version: '3.13' # arm64 only has Python >= 3.11 available
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.10"
python-version: ${{ matrix.python_version }}
architecture: ${{ matrix.arch }}
- name: Restore cached requirements
id: restore-cache
if: matrix.arch == 'arm64'
uses: actions/cache/restore@v4
env:
SEGMENT_DOWNLOAD_TIMEOUT_MINS: 1
with:
path: |
/yt-dlp-build-venv
key: cache-reqs-${{ github.job }}_${{ matrix.arch }}-${{ matrix.python_version }}-${{ github.ref }}
- name: Install Requirements
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
run: |
python -m venv /yt-dlp-build-venv
/yt-dlp-build-venv/Scripts/Activate.ps1
python devscripts/install_deps.py -o --include build
python devscripts/install_deps.py --include curl-cffi
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-6.13.0-py3-none-any.whl"
python devscripts/install_deps.py ${{ (matrix.arch != 'x86' && '--include curl-cffi') || '' }}
# Use custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/${{ matrix.arch }}/pyinstaller-6.15.0-py3-none-any.whl"
- name: Prepare
run: |
@@ -416,14 +387,18 @@ jobs:
python devscripts/make_lazy_extractors.py
- name: Build
run: |
/yt-dlp-build-venv/Scripts/Activate.ps1
python -m bundle.pyinstaller
python -m bundle.pyinstaller --onedir
Compress-Archive -Path ./dist/yt-dlp/* -DestinationPath ./dist/yt-dlp_win.zip
Compress-Archive -Path ./dist/yt-dlp${{ matrix.suffix }}/* -DestinationPath ./dist/yt-dlp_win${{ matrix.suffix }}.zip
- name: Verify --update-to
if: vars.UPDATE_TO_VERIFICATION
# if: vars.UPDATE_TO_VERIFICATION
# Temporarily skip for arm64 until there is a release that it can --update-to
if: |
vars.UPDATE_TO_VERIFICATION && matrix.arch != 'arm64'
run: |
foreach ($name in @("yt-dlp")) {
foreach ($name in @("yt-dlp${{ matrix.suffix }}")) {
Copy-Item "./dist/${name}.exe" "./dist/${name}_downgraded.exe"
$version = & "./dist/${name}.exe" --version
& "./dist/${name}_downgraded.exe" -v --update-to yt-dlp/yt-dlp@2023.03.04
@@ -433,61 +408,43 @@ jobs:
}
}
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: build-bin-${{ github.job }}
path: |
dist/yt-dlp.exe
dist/yt-dlp_win.zip
compression-level: 0
windows32:
needs: process
if: inputs.windows32
runs-on: windows-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.10"
architecture: "x86"
- name: Install Requirements
# TODO: remove when there is a windows_arm64 release that we can --update-to
- name: Verify arm64 executable
if: |
vars.UPDATE_TO_VERIFICATION && matrix.arch == 'arm64'
run: |
python devscripts/install_deps.py -o --include build
python devscripts/install_deps.py
python -m pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-6.13.0-py3-none-any.whl"
- name: Prepare
run: |
python devscripts/update-version.py -c "${{ inputs.channel }}" -r "${{ needs.process.outputs.origin }}" "${{ inputs.version }}"
python devscripts/make_lazy_extractors.py
- name: Build
run: |
python -m bundle.pyinstaller
- name: Verify --update-to
if: vars.UPDATE_TO_VERIFICATION
run: |
foreach ($name in @("yt-dlp_x86")) {
Copy-Item "./dist/${name}.exe" "./dist/${name}_downgraded.exe"
$version = & "./dist/${name}.exe" --version
& "./dist/${name}_downgraded.exe" -v --update-to yt-dlp/yt-dlp@2023.03.04
$downgraded_version = & "./dist/${name}_downgraded.exe" --version
if ($version -eq $downgraded_version) {
exit 1
}
foreach ($name in @("yt-dlp${{ matrix.suffix }}")) {
& "./dist/${name}.exe" -v --print-traffic --impersonate chrome "https://tls.browserleaks.com/json" -o ./resp.json
& cat ./resp.json
& "./dist/${name}.exe" --version
}
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: build-bin-${{ github.job }}
name: build-bin-${{ github.job }}-${{ matrix.arch }}
path: |
dist/yt-dlp_x86.exe
dist/yt-dlp${{ matrix.suffix }}.exe
dist/yt-dlp_win${{ matrix.suffix }}.zip
compression-level: 0
- name: Cleanup cache
if: |
matrix.arch == 'arm64' && steps.restore-cache.outputs.cache-hit == 'true'
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
cache_key: cache-reqs-${{ github.job }}_${{ matrix.arch }}-${{ matrix.python_version }}-${{ github.ref }}
run: |
gh cache delete "${cache_key}"
- name: Cache requirements
if: matrix.arch == 'arm64'
uses: actions/cache/save@v4
with:
path: |
/yt-dlp-build-venv
key: cache-reqs-${{ github.job }}_${{ matrix.arch }}-${{ matrix.python_version }}-${{ github.ref }}
meta_files:
if: always() && !cancelled()
needs:
@@ -496,9 +453,7 @@ jobs:
- linux_static
- linux_arm
- macos
- macos_legacy
- windows
- windows32
runs-on: ubuntu-latest
steps:
- name: Download artifacts
@@ -528,27 +483,31 @@ jobs:
lock 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
lock 2024.10.22 py2exe .+
lock 2024.10.22 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lock 2024.10.22 (?!\w+_exe).+ Python 3\.8
lock 2024.10.22 zip Python 3\.8
lock 2024.10.22 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lock 2025.08.11 darwin_legacy_exe .+
lockV2 yt-dlp/yt-dlp 2022.08.18.36 .+ Python 3\.6
lockV2 yt-dlp/yt-dlp 2023.11.16 (?!win_x86_exe).+ Python 3\.7
lockV2 yt-dlp/yt-dlp 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
lockV2 yt-dlp/yt-dlp 2024.10.22 py2exe .+
lockV2 yt-dlp/yt-dlp 2024.10.22 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lockV2 yt-dlp/yt-dlp 2024.10.22 (?!\w+_exe).+ Python 3\.8
lockV2 yt-dlp/yt-dlp 2024.10.22 zip Python 3\.8
lockV2 yt-dlp/yt-dlp 2024.10.22 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lockV2 yt-dlp/yt-dlp 2025.08.11 darwin_legacy_exe .+
lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 (?!win_x86_exe).+ Python 3\.7
lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 win_x86_exe .+ Windows-(?:Vista|2008Server)
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 py2exe .+
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 (?!\w+_exe).+ Python 3\.8
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 zip Python 3\.8
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lockV2 yt-dlp/yt-dlp-nightly-builds 2025.08.12.233030 darwin_legacy_exe .+
lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 (?!win_x86_exe).+ Python 3\.7
lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 win_x86_exe .+ Windows-(?:Vista|2008Server)
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.045052 py2exe .+
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 (?!\w+_exe).+ Python 3\.8
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 zip Python 3\.8
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lockV2 yt-dlp/yt-dlp-master-builds 2025.08.12.232447 darwin_legacy_exe .+
EOF
- name: Sign checksum files

View File

@@ -37,7 +37,7 @@ jobs:
matrix:
os: [ubuntu-latest]
# CPython 3.9 is in quick-test
python-version: ['3.10', '3.11', '3.12', '3.13', pypy-3.10]
python-version: ['3.10', '3.11', '3.12', '3.13', pypy-3.11]
include:
# atleast one of each CPython/PyPy tests must be in windows
- os: windows-latest
@@ -49,7 +49,7 @@ jobs:
- os: windows-latest
python-version: '3.13'
- os: windows-latest
python-version: pypy-3.10
python-version: pypy-3.11
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}

View File

@@ -28,13 +28,13 @@ jobs:
fail-fast: true
matrix:
os: [ubuntu-latest]
python-version: ['3.10', '3.11', '3.12', '3.13', pypy-3.10]
python-version: ['3.10', '3.11', '3.12', '3.13', pypy-3.11]
include:
# atleast one of each CPython/PyPy tests must be in windows
- os: windows-latest
python-version: '3.9'
- os: windows-latest
python-version: pypy-3.10
python-version: pypy-3.11
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}

View File

@@ -25,7 +25,7 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest]
python-version: ['3.9', '3.10', '3.11', '3.12', '3.13', pypy-3.10, pypy-3.11]
python-version: ['3.9', '3.10', '3.11', '3.12', '3.13', pypy-3.11]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}

View File

@@ -272,7 +272,7 @@ After you have ensured this site is distributing its content legally, you can fo
You can use `hatch fmt` to automatically fix problems. Rules that the linter/formatter enforces should not be disabled with `# noqa` unless a maintainer requests it. The only exception allowed is for old/printf-style string formatting in GraphQL query templates (use `# noqa: UP031`).
1. Make sure your code works under all [Python](https://www.python.org/) versions supported by yt-dlp, namely CPython >=3.9 and PyPy >=3.10. Backward compatibility is not required for even older versions of Python.
1. Make sure your code works under all [Python](https://www.python.org/) versions supported by yt-dlp, namely CPython >=3.9 and PyPy >=3.11. Backward compatibility is not required for even older versions of Python.
1. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files, [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:
```shell

View File

@@ -4,6 +4,7 @@ coletdjnz/colethedj (collaborator)
Ashish0804 (collaborator)
bashonly (collaborator)
Grub4K (collaborator)
seproDev (collaborator)
h-h-h-h
pauldubois98
nixxo
@@ -403,7 +404,6 @@ rebane2001
road-master
rohieb
sdht0
seproDev
Hill-98
LXYan2333
mushbite
@@ -793,3 +793,16 @@ moonshinerd
R0hanW
ShockedPlot7560
swayll
atsushi2965
barryvan
injust
iribeirocampos
rolandcrosby
Sojiroh
tchebb
AzartX47
e2dk4r
junyilou
PierreMesure
Randalix
runarmod

View File

@@ -4,6 +4,130 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2025.08.20
#### Core changes
- [Warn against using `-f mp4`](https://github.com/yt-dlp/yt-dlp/commit/70f56699515e0854a4853d214dce11b61d432387) ([#13915](https://github.com/yt-dlp/yt-dlp/issues/13915)) by [seproDev](https://github.com/seproDev)
- **utils**: [Add improved `jwt_encode` function](https://github.com/yt-dlp/yt-dlp/commit/35da8df4f843cb8f0656a301e5bebbf47d64d69a) ([#14071](https://github.com/yt-dlp/yt-dlp/issues/14071)) by [bashonly](https://github.com/bashonly)
#### Extractor changes
- [Extract avif storyboard formats from MPD manifests](https://github.com/yt-dlp/yt-dlp/commit/770119bdd15c525ba4338503f0eb68ea4baedf10) ([#14016](https://github.com/yt-dlp/yt-dlp/issues/14016)) by [doe1080](https://github.com/doe1080)
- `_rta_search`: [Do not assume `age_limit` is `0`](https://github.com/yt-dlp/yt-dlp/commit/6ae3543d5a1feea0c546571fd2782b024c108eac) ([#13985](https://github.com/yt-dlp/yt-dlp/issues/13985)) by [doe1080](https://github.com/doe1080)
- **adobetv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/c22660aed5fadb4ac29bdf25db4e8016414153cc) ([#13917](https://github.com/yt-dlp/yt-dlp/issues/13917)) by [doe1080](https://github.com/doe1080)
- **bilibili**: [Handle Bangumi redirection](https://github.com/yt-dlp/yt-dlp/commit/6ca9165648ac9a07c012de639faf50a97cbe0991) ([#14038](https://github.com/yt-dlp/yt-dlp/issues/14038)) by [grqz](https://github.com/grqz), [junyilou](https://github.com/junyilou)
- **faulio**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/74b4b3b00516e92a60250e0626272a6826459057) ([#13907](https://github.com/yt-dlp/yt-dlp/issues/13907)) by [CasperMcFadden95](https://github.com/CasperMcFadden95)
- **francetv**: site: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7b8a8abb98165a53c026e2a3f52faee608df1f20) ([#14082](https://github.com/yt-dlp/yt-dlp/issues/14082)) by [bashonly](https://github.com/bashonly)
- **medialaan**: [Rework extractors](https://github.com/yt-dlp/yt-dlp/commit/86d74e5cf0e06c53c931ccdbdd497e3f2c4d2fe2) ([#14015](https://github.com/yt-dlp/yt-dlp/issues/14015)) by [doe1080](https://github.com/doe1080)
- **mtv**: [Overhaul extractors](https://github.com/yt-dlp/yt-dlp/commit/8df121ba59208979aa713822781891347abd03d1) ([#14052](https://github.com/yt-dlp/yt-dlp/issues/14052)) by [bashonly](https://github.com/bashonly), [doe1080](https://github.com/doe1080), [Randalix](https://github.com/Randalix), [seproDev](https://github.com/seproDev)
- **niconico**: live: [Support age-restricted streams](https://github.com/yt-dlp/yt-dlp/commit/374ea049f531959bcccf8a1e6bc5659d228a780e) ([#13549](https://github.com/yt-dlp/yt-dlp/issues/13549)) by [doe1080](https://github.com/doe1080)
- **nrktvepisode**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7540aa1da1800769af40381f423825a1a8826377) ([#14065](https://github.com/yt-dlp/yt-dlp/issues/14065)) by [runarmod](https://github.com/runarmod)
- **puhutv**: [Fix playlists extraction](https://github.com/yt-dlp/yt-dlp/commit/36e873822bdb2c5aba3780dd3ae32cbae564c6cd) ([#11955](https://github.com/yt-dlp/yt-dlp/issues/11955)) by [e2dk4r](https://github.com/e2dk4r)
- **steam**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/d3d1ac8eb2f9e96f3d75292e0effe2b1bccece3b) ([#14008](https://github.com/yt-dlp/yt-dlp/issues/14008)) by [AzartX47](https://github.com/AzartX47)
- **svt**: [Extract forced subs under separate lang code](https://github.com/yt-dlp/yt-dlp/commit/82a139020417a501f261d9fe02cefca01b1e12e4) ([#14062](https://github.com/yt-dlp/yt-dlp/issues/14062)) by [PierreMesure](https://github.com/PierreMesure)
- **tiktok**: user: [Avoid infinite loop during extraction](https://github.com/yt-dlp/yt-dlp/commit/edf55e81842fcfa6c302528d7f33ccd5081b37ef) ([#14032](https://github.com/yt-dlp/yt-dlp/issues/14032)) by [bashonly](https://github.com/bashonly) (With fixes in [471a2b6](https://github.com/yt-dlp/yt-dlp/commit/471a2b60e0a3e056960d9ceb1ebf57908428f752))
- **vimeo**
- album: [Support embed-only and non-numeric albums](https://github.com/yt-dlp/yt-dlp/commit/d8200ff0a4699e06c9f7daca8f8531f8b98e68f2) ([#14021](https://github.com/yt-dlp/yt-dlp/issues/14021)) by [bashonly](https://github.com/bashonly)
- event: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/0f6b915822fb64bd944126fdacd401975c9f06ed) ([#14064](https://github.com/yt-dlp/yt-dlp/issues/14064)) by [bashonly](https://github.com/bashonly)
- **weibo**
- [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/8e3f8065af1415caeff788c5c430703dd0d8f576) ([#14012](https://github.com/yt-dlp/yt-dlp/issues/14012)) by [AzartX47](https://github.com/AzartX47), [bashonly](https://github.com/bashonly)
- [Support more URLs and --no-playlist](https://github.com/yt-dlp/yt-dlp/commit/404bd889d0e0b62ad72b7281e3fefdc0497080b3) ([#14035](https://github.com/yt-dlp/yt-dlp/issues/14035)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Add `es5` and `es6` player JS variants](https://github.com/yt-dlp/yt-dlp/commit/f2919bd28eac905f1267c62b83738a02bb5b4e04) ([#14005](https://github.com/yt-dlp/yt-dlp/issues/14005)) by [bashonly](https://github.com/bashonly)
- [Add `playback_wait` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/f63a7e41d120ef84f0f2274b0962438e3272d2fa) by [bashonly](https://github.com/bashonly)
- [Default to `main` player JS variant](https://github.com/yt-dlp/yt-dlp/commit/df0553153e41f81e3b30aa5bb1d119c61bd449ac) ([#14079](https://github.com/yt-dlp/yt-dlp/issues/14079)) by [bashonly](https://github.com/bashonly)
- [Extract title and description from initial data](https://github.com/yt-dlp/yt-dlp/commit/7bc53ae79930b36f4f947679545c75f36e9f0ddd) ([#14078](https://github.com/yt-dlp/yt-dlp/issues/14078)) by [bashonly](https://github.com/bashonly)
- [Handle required preroll waiting period](https://github.com/yt-dlp/yt-dlp/commit/a97f4cb57e61e19be61a7d5ac19665d4b567c960) ([#14081](https://github.com/yt-dlp/yt-dlp/issues/14081)) by [bashonly](https://github.com/bashonly)
- [Remove default player params](https://github.com/yt-dlp/yt-dlp/commit/d154dc3dcf0c7c75dbabb6cd1aca66fdd806f858) ([#14081](https://github.com/yt-dlp/yt-dlp/issues/14081)) by [bashonly](https://github.com/bashonly)
- tab: [Fix playlists tab extraction](https://github.com/yt-dlp/yt-dlp/commit/8a8861d53864c8a38e924bc0657ead5180f17268) ([#14030](https://github.com/yt-dlp/yt-dlp/issues/14030)) by [bashonly](https://github.com/bashonly)
#### Downloader changes
- [Support `available_at` format field](https://github.com/yt-dlp/yt-dlp/commit/438d3f06b3c41bdef8112d40b75d342186e91a16) ([#13980](https://github.com/yt-dlp/yt-dlp/issues/13980)) by [bashonly](https://github.com/bashonly)
#### Postprocessor changes
- **xattrmetadata**: [Only set "Where From" attribute on macOS](https://github.com/yt-dlp/yt-dlp/commit/bdeb3eb3f29eebbe8237fbc5186e51e7293eea4a) ([#13999](https://github.com/yt-dlp/yt-dlp/issues/13999)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **build**
- [Add Windows ARM64 builds](https://github.com/yt-dlp/yt-dlp/commit/07247d6c20fef1ad13b6f71f6355a44d308cf010) ([#14003](https://github.com/yt-dlp/yt-dlp/issues/14003)) by [bashonly](https://github.com/bashonly)
- [Bump PyInstaller version to 6.15.0 for Windows](https://github.com/yt-dlp/yt-dlp/commit/681ed2153de754c2c885fdad09ab71fffa8114f9) ([#14002](https://github.com/yt-dlp/yt-dlp/issues/14002)) by [bashonly](https://github.com/bashonly)
- [Discontinue `darwin_legacy_exe` support](https://github.com/yt-dlp/yt-dlp/commit/aea85d525e1007bb64baec0e170c054292d0858a) ([#13860](https://github.com/yt-dlp/yt-dlp/issues/13860)) by [bashonly](https://github.com/bashonly)
- **cleanup**
- [Remove dead extractors](https://github.com/yt-dlp/yt-dlp/commit/6f4c1bb593da92f0ce68229d0c813cdbaf1314da) ([#13996](https://github.com/yt-dlp/yt-dlp/issues/13996)) by [doe1080](https://github.com/doe1080)
- Miscellaneous: [c2fc4f3](https://github.com/yt-dlp/yt-dlp/commit/c2fc4f3e7f6d757250183b177130c64beee50520) by [bashonly](https://github.com/bashonly)
### 2025.08.11
#### Important changes
- **The minimum *recommended* Python version has been raised to 3.10**
Since Python 3.9 will reach end-of-life in October 2025, support for it will be dropped soon. [Read more](https://github.com/yt-dlp/yt-dlp/issues/13858)
- **darwin_legacy_exe builds are being discontinued**
This release's `yt-dlp_macos_legacy` binary will likely be the last one. [Read more](https://github.com/yt-dlp/yt-dlp/issues/13856)
- **linux_armv7l_exe builds are being discontinued**
This release's `yt-dlp_linux_armv7l` binary could be the last one. [Read more](https://github.com/yt-dlp/yt-dlp/issues/13976)
#### Core changes
- [Deprecate `darwin_legacy_exe` support](https://github.com/yt-dlp/yt-dlp/commit/cc5a5caac5fbc0d605b52bde0778d6fd5f97b5ab) ([#13857](https://github.com/yt-dlp/yt-dlp/issues/13857)) by [bashonly](https://github.com/bashonly)
- [Deprecate `linux_armv7l_exe` support](https://github.com/yt-dlp/yt-dlp/commit/c76ce28e06c816eb5b261dfb6aff6e69dd9b7382) ([#13978](https://github.com/yt-dlp/yt-dlp/issues/13978)) by [bashonly](https://github.com/bashonly)
- [Raise minimum recommended Python version to 3.10](https://github.com/yt-dlp/yt-dlp/commit/23c658b9cbe34a151f8f921ab1320bb5d4e40a4d) ([#13859](https://github.com/yt-dlp/yt-dlp/issues/13859)) by [bashonly](https://github.com/bashonly)
- [Warn when yt-dlp is severely outdated](https://github.com/yt-dlp/yt-dlp/commit/662af5bb8307ec3ff8ab0857f1159922d64792f0) ([#13937](https://github.com/yt-dlp/yt-dlp/issues/13937)) by [seproDev](https://github.com/seproDev)
- **cookies**: [Load cookies with float `expires` timestamps](https://github.com/yt-dlp/yt-dlp/commit/28b68f687561468e0c664dcb430707458970019f) ([#13873](https://github.com/yt-dlp/yt-dlp/issues/13873)) by [bashonly](https://github.com/bashonly)
- **utils**
- [Add `WINDOWS_VT_MODE` to globals](https://github.com/yt-dlp/yt-dlp/commit/eed94c7306d4ecdba53ad8783b1463a9af5c97f1) ([#12460](https://github.com/yt-dlp/yt-dlp/issues/12460)) by [Grub4K](https://github.com/Grub4K)
- `parse_resolution`: [Support width-only pattern](https://github.com/yt-dlp/yt-dlp/commit/4385480795acda35667be008d0bf26b46e9d65b4) ([#13802](https://github.com/yt-dlp/yt-dlp/issues/13802)) by [doe1080](https://github.com/doe1080)
- `random_user_agent`: [Bump versions](https://github.com/yt-dlp/yt-dlp/commit/c59ad2b066bbccd3cc4eed580842f961bce7dd4a) ([#13543](https://github.com/yt-dlp/yt-dlp/issues/13543)) by [bashonly](https://github.com/bashonly)
#### Extractor changes
- **archive.org**: [Fix metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/42ca3d601ee10cef89d698f72e2b5d44fab4f013) ([#13880](https://github.com/yt-dlp/yt-dlp/issues/13880)) by [bashonly](https://github.com/bashonly)
- **digitalconcerthall**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/e8d2807296ccc603e031f5982623a8311f2a5119) ([#13948](https://github.com/yt-dlp/yt-dlp/issues/13948)) by [bashonly](https://github.com/bashonly)
- **eagleplatform**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/1fe83b0111277a6f214c5ec1819cfbf943508baf) ([#13469](https://github.com/yt-dlp/yt-dlp/issues/13469)) by [doe1080](https://github.com/doe1080)
- **fauliolive**
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3e609b2cedd285739bf82c7af7853735092070a4) ([#13421](https://github.com/yt-dlp/yt-dlp/issues/13421)) by [CasperMcFadden95](https://github.com/CasperMcFadden95), [seproDev](https://github.com/seproDev)
- [Support Bahry TV](https://github.com/yt-dlp/yt-dlp/commit/daa1859be1b0e7d123da8b4e0988f2eb7bd47d15) ([#13850](https://github.com/yt-dlp/yt-dlp/issues/13850)) by [CasperMcFadden95](https://github.com/CasperMcFadden95)
- **fc2**: [Fix old video support](https://github.com/yt-dlp/yt-dlp/commit/cd31c319e3142622ec43c49485d196ed2835df05) ([#12633](https://github.com/yt-dlp/yt-dlp/issues/12633)) by [JChris246](https://github.com/JChris246), [seproDev](https://github.com/seproDev)
- **motherless**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/e8d49b1c7f11c7e282319395ca9c2a201304be41) ([#13960](https://github.com/yt-dlp/yt-dlp/issues/13960)) by [Grub4K](https://github.com/Grub4K)
- **n1info**: article: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/6539ee1947d7885d3606da6365fd858308435a63) ([#13865](https://github.com/yt-dlp/yt-dlp/issues/13865)) by [u-spec-png](https://github.com/u-spec-png)
- **neteasemusic**: [Support XFF](https://github.com/yt-dlp/yt-dlp/commit/e8c2bf798b6707d27fecde66161172da69c7cd72) ([#11044](https://github.com/yt-dlp/yt-dlp/issues/11044)) by [c-basalt](https://github.com/c-basalt)
- **niconico**: [Fix error handling & improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/05e553e9d1f57655d65c9811d05df38261601b85) ([#13240](https://github.com/yt-dlp/yt-dlp/issues/13240)) by [doe1080](https://github.com/doe1080)
- **parlview**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/485de69dbfeb7de7bcf9f7fe16d6c6ba9e81e1a0) ([#13788](https://github.com/yt-dlp/yt-dlp/issues/13788)) by [barryvan](https://github.com/barryvan)
- **plyrembed**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/61d4cd0bc01be6ebe11fd53c2d3805d1a2058990) ([#13836](https://github.com/yt-dlp/yt-dlp/issues/13836)) by [seproDev](https://github.com/seproDev)
- **royalive**: [Support `en` URLs](https://github.com/yt-dlp/yt-dlp/commit/43dedbe6394bdd489193b15ee9690a62d1b82d94) ([#13908](https://github.com/yt-dlp/yt-dlp/issues/13908)) by [CasperMcFadden95](https://github.com/CasperMcFadden95)
- **rtve.es**: program: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/b831406a1d3be34c159835079d12bae624c43610) ([#12955](https://github.com/yt-dlp/yt-dlp/issues/12955)) by [meGAmeS1](https://github.com/meGAmeS1), [seproDev](https://github.com/seproDev)
- **shiey**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/6ff135c31914ea8b5545f8d187c60e852cfde9bc) ([#13354](https://github.com/yt-dlp/yt-dlp/issues/13354)) by [iribeirocampos](https://github.com/iribeirocampos)
- **sportdeuschland**: [Support embedded player URLs](https://github.com/yt-dlp/yt-dlp/commit/30302df22b7b431ce920e0f7298cd10be9989967) ([#13833](https://github.com/yt-dlp/yt-dlp/issues/13833)) by [InvalidUsernameException](https://github.com/InvalidUsernameException)
- **sproutvideo**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/59765ecbc08d18005de7143fbb1d1caf90239471) ([#13813](https://github.com/yt-dlp/yt-dlp/issues/13813)) by [bashonly](https://github.com/bashonly)
- **tbs**: [Fix truTV support](https://github.com/yt-dlp/yt-dlp/commit/0adeb1e54b2d7e95cd19999e71013877850f8f41) ([#9683](https://github.com/yt-dlp/yt-dlp/issues/9683)) by [bashonly](https://github.com/bashonly), [ischmidt20](https://github.com/ischmidt20)
- **tbsjp**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/71f30921a2023dbb25c53fd1bb1399cac803116d) ([#13485](https://github.com/yt-dlp/yt-dlp/issues/13485)) by [garret1317](https://github.com/garret1317)
- **tver**
- [Extract Streaks API info](https://github.com/yt-dlp/yt-dlp/commit/70d7687487252a08dbf8b2831743e7833472ba05) ([#13885](https://github.com/yt-dlp/yt-dlp/issues/13885)) by [bashonly](https://github.com/bashonly)
- [Support --ignore-no-formats-error when geo-blocked](https://github.com/yt-dlp/yt-dlp/commit/121647705a2fc6b968278723fe61801007e228a4) ([#13598](https://github.com/yt-dlp/yt-dlp/issues/13598)) by [arabcoders](https://github.com/arabcoders)
- **tvw**: news: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/682334e4b35112f7a5798decdcb5cb12230ef948) ([#12907](https://github.com/yt-dlp/yt-dlp/issues/12907)) by [fries1234](https://github.com/fries1234)
- **vimeo**: [Fix login support and require authentication](https://github.com/yt-dlp/yt-dlp/commit/afaf60d9fd5a0c7a85aeb1374fd97fbc13cd652c) ([#13823](https://github.com/yt-dlp/yt-dlp/issues/13823)) by [bashonly](https://github.com/bashonly)
- **yandexdisk**: [Support 360 URLs](https://github.com/yt-dlp/yt-dlp/commit/a6df5e8a58d6743dd230011389c986495ec509da) ([#13935](https://github.com/yt-dlp/yt-dlp/issues/13935)) by [Sojiroh](https://github.com/Sojiroh)
- **youtube**
- [Add player params to mweb client](https://github.com/yt-dlp/yt-dlp/commit/38c2bf40260f7788efb5a7f5e8eba8e5cb43f741) ([#13914](https://github.com/yt-dlp/yt-dlp/issues/13914)) by [coletdjnz](https://github.com/coletdjnz)
- [Update player params](https://github.com/yt-dlp/yt-dlp/commit/bf366517ef0b745490ee9e0f929254fa26b69647) ([#13979](https://github.com/yt-dlp/yt-dlp/issues/13979)) by [bashonly](https://github.com/bashonly)
#### Downloader changes
- **dash**: [Re-extract if using --load-info-json with --live-from-start](https://github.com/yt-dlp/yt-dlp/commit/fe53ebe5b66a03c664708a4d6fd87b8c13a1bc7b) ([#13922](https://github.com/yt-dlp/yt-dlp/issues/13922)) by [bashonly](https://github.com/bashonly)
- **external**: [Work around ffmpeg's `file:` URL handling](https://github.com/yt-dlp/yt-dlp/commit/d399505fdf8292332bdc91d33859a0b0d08104fd) ([#13844](https://github.com/yt-dlp/yt-dlp/issues/13844)) by [bashonly](https://github.com/bashonly)
- **hls**: [Fix `--hls-split-continuity` support](https://github.com/yt-dlp/yt-dlp/commit/57186f958f164daa50203adcbf7ec74d541151cf) ([#13321](https://github.com/yt-dlp/yt-dlp/issues/13321)) by [tchebb](https://github.com/tchebb)
#### Postprocessor changes
- **embedthumbnail**: [Fix ffmpeg args for embedding in mp3](https://github.com/yt-dlp/yt-dlp/commit/7e3f48d64d237281a97b3df1a61980c78a0302fe) ([#13720](https://github.com/yt-dlp/yt-dlp/issues/13720)) by [atsushi2965](https://github.com/atsushi2965)
- **xattrmetadata**: [Add macOS "Where from" attribute](https://github.com/yt-dlp/yt-dlp/commit/3e918d825d7ff367812658957b281b8cda8f9ebb) ([#12664](https://github.com/yt-dlp/yt-dlp/issues/12664)) by [rolandcrosby](https://github.com/rolandcrosby) (With fixes in [1e0c77d](https://github.com/yt-dlp/yt-dlp/commit/1e0c77ddcce335a1875ecc17d93ed6ff3fabd975) by [seproDev](https://github.com/seproDev))
#### Networking changes
- **Request Handler**
- curl_cffi: [Support `curl_cffi` 0.11.x, 0.12.x, 0.13.x](https://github.com/yt-dlp/yt-dlp/commit/e98695549e2eb8ce4a59abe16b5afa8adc075bbe) ([#13989](https://github.com/yt-dlp/yt-dlp/issues/13989)) by [bashonly](https://github.com/bashonly)
- requests: [Bump minimum required version of urllib3 to 2.0.2](https://github.com/yt-dlp/yt-dlp/commit/8175f3738fe4db3bc629d36bb72b927d4286d3f9) ([#13939](https://github.com/yt-dlp/yt-dlp/issues/13939)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **build**: [Use `macos-14` runner for `macos` builds](https://github.com/yt-dlp/yt-dlp/commit/66aa21dc5a3b79059c38f3ad1d05dc9b29187701) ([#13814](https://github.com/yt-dlp/yt-dlp/issues/13814)) by [bashonly](https://github.com/bashonly)
- **ci**: [Bump supported PyPy version to 3.11](https://github.com/yt-dlp/yt-dlp/commit/62e2a9c0d55306906f18da2927e05e1cbc31473c) ([#13877](https://github.com/yt-dlp/yt-dlp/issues/13877)) by [bashonly](https://github.com/bashonly)
- **cleanup**
- [Move embed tests to dedicated extractors](https://github.com/yt-dlp/yt-dlp/commit/1c6068af997cfc0e28061fc00f4d6091e1de57da) ([#13782](https://github.com/yt-dlp/yt-dlp/issues/13782)) by [doe1080](https://github.com/doe1080)
- Miscellaneous: [5e4ceb3](https://github.com/yt-dlp/yt-dlp/commit/5e4ceb35cf997af0dbf100e1de37f4e2bcbaa0b7) by [bashonly](https://github.com/bashonly), [injust](https://github.com/injust), [seproDev](https://github.com/seproDev)
### 2025.07.21
#### Important changes

View File

@@ -106,12 +106,14 @@ File|Description
File|Description
:---|:---
[yt-dlp_x86.exe](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_x86.exe)|Windows (Win8+) standalone x86 (32-bit) binary
[yt-dlp_arm64.exe](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_arm64.exe)|Windows (Win10+) standalone arm64 (64-bit) binary
[yt-dlp_linux](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_linux)|Linux standalone x64 binary
[yt-dlp_linux_armv7l](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_linux_armv7l)|Linux standalone armv7l (32-bit) binary
[yt-dlp_linux_aarch64](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_linux_aarch64)|Linux standalone aarch64 (64-bit) binary
[yt-dlp_win.zip](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_win.zip)|Unpackaged Windows executable (no auto-update)
[yt-dlp_win.zip](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_win.zip)|Unpackaged Windows (Win8+) x64 executable (no auto-update)
[yt-dlp_win_x86.zip](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_win_x86.zip)|Unpackaged Windows (Win8+) x86 executable (no auto-update)
[yt-dlp_win_arm64.zip](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_win_arm64.zip)|Unpackaged Windows (Win10+) arm64 executable (no auto-update)
[yt-dlp_macos.zip](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos.zip)|Unpackaged MacOS (10.15+) executable (no auto-update)
[yt-dlp_macos_legacy](https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_macos_legacy)|MacOS (10.9+) standalone x64 executable
#### Misc
@@ -171,8 +173,11 @@ yt-dlp --update-to nightly
python3 -m pip install -U --pre "yt-dlp[default]"
```
When running a yt-dlp version that is older than 90 days, you will see a warning message suggesting to update to the latest version.
You can suppress this warning by adding `--no-update` to your command or configuration file.
## DEPENDENCIES
Python versions 3.9+ (CPython) and 3.10+ (PyPy) are supported. Other versions and implementations may or may not work correctly.
Python versions 3.9+ (CPython) and 3.11+ (PyPy) are supported. Other versions and implementations may or may not work correctly.
<!-- Python 3.5+ uses VC++14 and it is already embedded in the binary created
<!x-- https://www.microsoft.com/en-us/download/details.aspx?id=26999 --x>
@@ -208,7 +213,7 @@ The following provide support for impersonating browser requests. This may be re
* [**mutagen**](https://github.com/quodlibet/mutagen)\* - For `--embed-thumbnail` in certain formats. Licensed under [GPLv2+](https://github.com/quodlibet/mutagen/blob/master/COPYING)
* [**AtomicParsley**](https://github.com/wez/atomicparsley) - For `--embed-thumbnail` in `mp4`/`m4a` files when `mutagen`/`ffmpeg` cannot. Licensed under [GPLv2+](https://github.com/wez/atomicparsley/blob/master/COPYING)
* [**xattr**](https://github.com/xattr/xattr), [**pyxattr**](https://github.com/iustin/pyxattr) or [**setfattr**](http://savannah.nongnu.org/projects/attr) - For writing xattr metadata (`--xattr`) on **Mac** and **BSD**. Licensed under [MIT](https://github.com/xattr/xattr/blob/master/LICENSE.txt), [LGPL2.1](https://github.com/iustin/pyxattr/blob/master/COPYING) and [GPLv2+](http://git.savannah.nongnu.org/cgit/attr.git/tree/doc/COPYING) respectively
* [**xattr**](https://github.com/xattr/xattr), [**pyxattr**](https://github.com/iustin/pyxattr) or [**setfattr**](http://savannah.nongnu.org/projects/attr) - For writing xattr metadata (`--xattrs`) on **Mac** and **BSD**. Licensed under [MIT](https://github.com/xattr/xattr/blob/master/LICENSE.txt), [LGPL2.1](https://github.com/iustin/pyxattr/blob/master/COPYING) and [GPLv2+](http://git.savannah.nongnu.org/cgit/attr.git/tree/doc/COPYING) respectively
### Misc
@@ -1801,7 +1806,7 @@ The following extractors use this feature:
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
* `webpage_skip`: Skip extraction of embedded webpage data. One or both of `player_response`, `initial_data`. These options are for testing purposes and don't skip any network requests
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `player_js_variant`: The player javascript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. Only `main` is recommended as a possible workaround; the others are for debugging purposes. The default is to use what is prescribed by the site, and can be selected with `actual`
* `player_js_variant`: The player javascript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. The default is `main`, and the others are for debugging purposes. You can use `actual` to go with what is prescribed by the site
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
* E.g. `all,all,1000,10` will get a maximum of 1000 replies total, with up to 10 replies per thread. `1000,all,100` will get a maximum of 1000 comments, with a maximum of 100 replies total
@@ -1814,6 +1819,7 @@ The following extractors use this feature:
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma seperated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be any of `gvs` (Google Video Server URLs), `player` (Innertube player request) or `subs` (Subtitles)
* `pot_trace`: Enable debug logging for PO Token fetching. Either `true` or `false` (default)
* `fetch_pot`: Policy to use for fetching a PO Token from providers. One of `always` (always try fetch a PO Token regardless if the client requires one for the given context), `never` (never fetch a PO Token), or `auto` (default; only fetch a PO Token if the client requires one for the given context)
* `playback_wait`: Duration (in seconds) to wait inbetween the extraction and download stages in order to ensure the formats are available. The default is `6` seconds
#### youtubepot-webpo
* `bind_to_visitor_id`: Whether to use the Visitor ID instead of Visitor Data for caching WebPO tokens. Either `true` (default) or `false`
@@ -1902,7 +1908,7 @@ The following extractors use this feature:
* `backend`: Backend API to use for extraction - one of `streaks` (default) or `brightcove` (deprecated)
#### vimeo
* `client`: Client to extract video data from. The currently available clients are `android`, `ios`, and `web`. Only one client can be used. The `android` client is used by default. If account cookies or credentials are used for authentication, then the `web` client is used by default. The `web` client only works with authentication. The `ios` client only works with previously cached OAuth tokens
* `client`: Client to extract video data from. The currently available clients are `android`, `ios`, and `web`. Only one client can be used. The `web` client is used by default. The `web` client only works with account cookies or login credentials. The `android` and `ios` clients only work with previously cached OAuth tokens
* `original_format_policy`: Policy for when to try extracting original formats. One of `always`, `never`, or `auto`. The default `auto` policy tries to avoid exceeding the web client's API rate-limit by only making an extra request when Vimeo publicizes the video's downloadability
**Note**: These options may be changed/removed in the future without concern for backward compatibility
@@ -2367,7 +2373,6 @@ These are aliases that are no longer documented for various reasons
--dump-headers --print-traffic
--dump-intermediate-pages --dump-pages
--force-write-download-archive --force-write-archive
--load-info --load-info-json
--no-clean-infojson --no-clean-info-json
--no-split-tracks --no-split-chapters
--no-write-srt --no-write-subs

View File

@@ -62,16 +62,22 @@ def parse_options():
def exe(onedir):
"""@returns (name, path)"""
platform_name, machine, extension = {
'win32': (None, MACHINE, '.exe'),
'darwin': ('macos', None, None),
}.get(OS_NAME, (OS_NAME, MACHINE, None))
name = '_'.join(filter(None, (
'yt-dlp',
{'win32': '', 'darwin': 'macos'}.get(OS_NAME, OS_NAME),
MACHINE,
platform_name,
machine,
)))
return name, ''.join(filter(None, (
'dist/',
onedir and f'{name}/',
name,
OS_NAME == 'win32' and '.exe',
extension,
)))

View File

@@ -6,7 +6,7 @@ __yt_dlp()
prev="${COMP_WORDS[COMP_CWORD-1]}"
opts="{{flags}}"
keywords=":ytfavorites :ytrecommended :ytsubscriptions :ytwatchlater :ythistory"
fileopts="-a|--batch-file|--download-archive|--cookies|--load-info"
fileopts="-a|--batch-file|--download-archive|--cookies|--load-info-json"
diropts="--cache-dir"
if [[ ${prev} =~ ${fileopts} ]]; then

View File

@@ -272,5 +272,26 @@
"action": "add",
"when": "959ac99e98c3215437e573c22d64be42d361e863",
"short": "[priority] Security: [[CVE-2025-54072](https://nvd.nist.gov/vuln/detail/CVE-2025-54072)] [Fix `--exec` placeholder expansion on Windows](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-45hg-7f49-5h56)\n - When `--exec` is used on Windows, the filepath expanded from `{}` (or the default placeholder) is now properly escaped"
},
{
"action": "change",
"when": "b831406a1d3be34c159835079d12bae624c43610",
"short": "[ie/rtve.es:program] Add extractor (#12955)",
"authors": ["meGAmeS1", "seproDev"]
},
{
"action": "add",
"when": "23c658b9cbe34a151f8f921ab1320bb5d4e40a4d",
"short": "[priority] **The minimum *recommended* Python version has been raised to 3.10**\nSince Python 3.9 will reach end-of-life in October 2025, support for it will be dropped soon. [Read more](https://github.com/yt-dlp/yt-dlp/issues/13858)"
},
{
"action": "add",
"when": "cc5a5caac5fbc0d605b52bde0778d6fd5f97b5ab",
"short": "[priority] **darwin_legacy_exe builds are being discontinued**\nThis release's `yt-dlp_macos_legacy` binary will likely be the last one. [Read more](https://github.com/yt-dlp/yt-dlp/issues/13856)"
},
{
"action": "add",
"when": "c76ce28e06c816eb5b261dfb6aff6e69dd9b7382",
"short": "[priority] **linux_armv7l_exe builds are being discontinued**\nThis release's `yt-dlp_linux_armv7l` binary could be the last one. [Read more](https://github.com/yt-dlp/yt-dlp/issues/13976)"
}
]

View File

@@ -20,6 +20,7 @@ def parse_patched_options(opts):
'fragment_retries': 0,
'extract_flat': False,
'concat_playlist': 'never',
'update_self': False,
})
yt_dlp.options.create_parser = lambda: patched_parser
try:

View File

@@ -15,11 +15,11 @@ description = "A feature-rich command-line audio/video downloader"
readme = "README.md"
requires-python = ">=3.9"
keywords = [
"cli",
"downloader",
"youtube-dl",
"video-downloader",
"youtube-downloader",
"sponsorblock",
"youtube-dlc",
"yt-dlp",
]
license = {file = "LICENSE"}
@@ -51,11 +51,11 @@ default = [
"mutagen",
"pycryptodomex",
"requests>=2.32.2,<3",
"urllib3>=1.26.17,<3",
"urllib3>=2.0.2,<3",
"websockets>=13.0",
]
curl-cffi = [
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.11; implementation_name=='cpython'",
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.14; implementation_name=='cpython'",
]
secretstorage = [
"cffi",
@@ -315,6 +315,7 @@ banned-from = [
"yt_dlp.utils.error_to_compat_str".msg = "Use `str` instead."
"yt_dlp.utils.bytes_to_intlist".msg = "Use `list` instead."
"yt_dlp.utils.intlist_to_bytes".msg = "Use `bytes` instead."
"yt_dlp.utils.jwt_encode_hs256".msg = "Use `yt_dlp.utils.jwt_encode` instead."
"yt_dlp.utils.decodeArgument".msg = "Do not use"
"yt_dlp.utils.decodeFilename".msg = "Do not use"
"yt_dlp.utils.encodeFilename".msg = "Do not use"

View File

@@ -16,7 +16,7 @@ remove-unused-variables = true
[tox:tox]
skipsdist = true
envlist = py{39,310,311,312,313},pypy310
envlist = py{39,310,311,312,313},pypy311
skip_missing_interpreters = true
[testenv] # tox

View File

@@ -12,7 +12,7 @@ The only reliable way to check if a site is supported is to try it.
- **17live:vod**
- **1News**: 1news.co.nz article videos
- **1tv**: Первый канал
- **20min**
- **20min**: (**Currently broken**)
- **23video**
- **247sports**: (**Currently broken**)
- **24tv.ua**
@@ -45,10 +45,6 @@ The only reliable way to check if a site is supported is to try it.
- **ADNSeason**: [*animationdigitalnetwork*](## "netrc machine") Animation Digital Network
- **AdobeConnect**
- **adobetv**
- **adobetv:channel**
- **adobetv:embed**
- **adobetv:show**
- **adobetv:video**
- **AdultSwim**
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault
- **aenetworks:collection**
@@ -100,7 +96,6 @@ The only reliable way to check if a site is supported is to try it.
- **ARD**
- **ARDMediathek**
- **ARDMediathekCollection**
- **Arkena**
- **Art19**
- **Art19Show**
- **arte.sky.it**
@@ -155,9 +150,8 @@ The only reliable way to check if a site is supported is to try it.
- **Beatport**
- **Beeg**
- **BehindKink**: (**Currently broken**)
- **Bellator**
- **BerufeTV**
- **Bet**: (**Currently broken**)
- **Bet**
- **bfi:player**: (**Currently broken**)
- **bfmtv**
- **bfmtv:article**
@@ -285,18 +279,15 @@ The only reliable way to check if a site is supported is to try it.
- **Clipchamp**
- **Clippit**
- **ClipRs**: (**Currently broken**)
- **ClipYouEmbed**
- **CloserToTruth**: (**Currently broken**)
- **CloudflareStream**
- **CloudyCDN**
- **Clubic**: (**Currently broken**)
- **Clyp**
- **cmt.com**: (**Currently broken**)
- **CNBCVideo**
- **CNN**
- **CNNIndonesia**
- **ComedyCentral**
- **ComedyCentralTV**
- **ConanClassic**: (**Currently broken**)
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **CONtv**
@@ -396,7 +387,6 @@ The only reliable way to check if a site is supported is to try it.
- **dw:article**: (**Currently broken**)
- **dzen.ru**: Дзен (dzen) formerly Яндекс.Дзен (Yandex Zen)
- **dzen.ru:channel**
- **EaglePlatform**
- **EbaumsWorld**
- **Ebay**
- **egghead:course**: egghead.io course
@@ -447,6 +437,8 @@ The only reliable way to check if a site is supported is to try it.
- **fancode:live**: [*fancode*](## "netrc machine") (**Currently broken**)
- **fancode:vod**: [*fancode*](## "netrc machine") (**Currently broken**)
- **Fathom**
- **Faulio**
- **FaulioLive**
- **faz.net**
- **fc2**: [*fc2*](## "netrc machine")
- **fc2:embed**
@@ -701,8 +693,8 @@ The only reliable way to check if a site is supported is to try it.
- **lbry:channel**: odysee.com channels
- **lbry:playlist**: odysee.com playlists
- **LCI**
- **Lcp**
- **LcpPlay**
- **Lcp**: (**Currently broken**)
- **LcpPlay**: (**Currently broken**)
- **Le**: 乐视网
- **LearningOnScreen**
- **Lecture2Go**: (**Currently broken**)
@@ -728,7 +720,7 @@ The only reliable way to check if a site is supported is to try it.
- **Liputan6**
- **ListenNotes**
- **LiTV**
- **LiveJournal**
- **LiveJournal**: (**Currently broken**)
- **livestream**
- **livestream:original**
- **Livestreamfails**
@@ -841,12 +833,6 @@ The only reliable way to check if a site is supported is to try it.
- **MSN**
- **mtg**: MTG services
- **mtv**
- **mtv.de**: (**Currently broken**)
- **mtv.it**
- **mtv.it:programma**
- **mtv:video**
- **mtvjapan**
- **mtvservices:embedded**
- **MTVUutisetArticle**: (**Currently broken**)
- **MuenchenTV**: münchen.tv (**Currently broken**)
- **MujRozhlas**
@@ -946,9 +932,6 @@ The only reliable way to check if a site is supported is to try it.
- **NhkVodProgram**
- **nhl.com**
- **nick.com**
- **nick.de**
- **nickelodeon:br**
- **nickelodeonru**
- **niconico**: [*niconico*](## "netrc machine") ニコニコ動画
- **niconico:history**: NicoNico user history or likes. Requires cookies.
- **niconico:live**: [*niconico*](## "netrc machine") ニコニコ生放送
@@ -1050,13 +1033,12 @@ The only reliable way to check if a site is supported is to try it.
- **Panopto**
- **PanoptoList**
- **PanoptoPlaylist**
- **ParamountNetwork**
- **ParamountPlus**
- **ParamountPlusSeries**
- **ParamountPressExpress**
- **Parler**: Posts on parler.com
- **parliamentlive.tv**: UK parliament videos
- **Parlview**: (**Currently broken**)
- **Parlview**
- **parti:livestream**
- **parti:video**
- **patreon**
@@ -1089,7 +1071,6 @@ The only reliable way to check if a site is supported is to try it.
- **PiramideTVChannel**
- **pixiv:sketch**
- **pixiv:sketch:user**
- **Pladform**
- **PlanetMarathi**
- **Platzi**: [*platzi*](## "netrc machine")
- **PlatziCourse**: [*platzi*](## "netrc machine")
@@ -1105,6 +1086,7 @@ The only reliable way to check if a site is supported is to try it.
- **pluralsight:course**
- **PlutoTV**: (**Currently broken**)
- **PlVideo**: Платформа
- **PlyrEmbed**
- **PodbayFM**
- **PodbayFMChannel**
- **Podchaser**
@@ -1258,6 +1240,7 @@ The only reliable way to check if a site is supported is to try it.
- **rtve.es:alacarta**: RTVE a la carta and Play
- **rtve.es:audio**: RTVE audio
- **rtve.es:live**: RTVE.es live streams
- **rtve.es:program**: RTVE.es programs
- **rtve.es:television**
- **rtvslo.si**
- **rtvslo.si:show**
@@ -1275,7 +1258,7 @@ The only reliable way to check if a site is supported is to try it.
- **rutube:playlist**: Rutube playlists
- **rutube:tags**: Rutube tags
- **RUTV**: RUTV.RU
- **Ruutu**
- **Ruutu**: (**Currently broken**)
- **Ruv**
- **ruv.is:spila**
- **S4C**
@@ -1326,6 +1309,7 @@ The only reliable way to check if a site is supported is to try it.
- **SharePoint**
- **ShareVideosEmbed**
- **ShemarooMe**
- **Shiey**
- **ShowRoomLive**
- **ShugiinItvLive**: 衆議院インターネット審議中継
- **ShugiinItvLiveRoom**: 衆議院インターネット審議中継 (中継)
@@ -1375,15 +1359,16 @@ The only reliable way to check if a site is supported is to try it.
- **southpark.cc.com:español**
- **southpark.de**
- **southpark.lat**
- **southpark.nl**
- **southparkstudios.dk**
- **southparkstudios.co.uk**
- **southparkstudios.com.br**
- **southparkstudios.nu**
- **SovietsCloset**
- **SovietsClosetPlaylist**
- **SpankBang**
- **SpankBangPlaylist**
- **Spiegel**
- **Sport5**
- **SportBox**
- **SportBox**: (**Currently broken**)
- **SportDeutschland**
- **spotify**: Spotify episodes (**Currently broken**)
- **spotify:show**: Spotify shows (**Currently broken**)
@@ -1524,7 +1509,6 @@ The only reliable way to check if a site is supported is to try it.
- **TrueID**
- **TruNews**
- **Truth**
- **TruTV**
- **Tube8**: (**Currently broken**)
- **TubeTuGraz**: [*tubetugraz*](## "netrc machine") tube.tugraz.at
- **TubeTuGrazSeries**: [*tubetugraz*](## "netrc machine")
@@ -1556,7 +1540,6 @@ The only reliable way to check if a site is supported is to try it.
- **TVer**
- **tvigle**: Интернет-телевидение Tvigle.ru
- **TVIPlayer**
- **tvland.com**
- **TVN24**: (**Currently broken**)
- **TVNoe**: (**Currently broken**)
- **tvopengr:embed**: tvopen.gr embedded videos
@@ -1569,6 +1552,7 @@ The only reliable way to check if a site is supported is to try it.
- **TVPlayer**
- **TVPlayHome**
- **tvw**
- **tvw:news**
- **tvw:tvchannels**
- **Tweakers**
- **TwitCasting**
@@ -1616,15 +1600,13 @@ The only reliable way to check if a site is supported is to try it.
- **Vbox7**
- **Veo**
- **Vesti**: Вести.Ru (**Currently broken**)
- **Vevo**
- **VevoPlaylist**
- **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet
- **vh1.com**
- **vhx:embed**: [*vimeo*](## "netrc machine")
- **vice**: (**Currently broken**)
- **vice:article**: (**Currently broken**)
- **vice:show**: (**Currently broken**)
- **Viddler**
- **Viddler**: (**Currently broken**)
- **Videa**
- **video.arnes.si**: Arnes Video
- **video.google:search**: Google Video search; "gvsearch:" prefix
@@ -1696,7 +1678,7 @@ The only reliable way to check if a site is supported is to try it.
- **vrsquare:section**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **vrtmax**: [*vrtnu*](## "netrc machine") VRT MAX (formerly VRT NU)
- **VTM**: (**Currently broken**)
- **VTM**
- **VTV**
- **VTVGo**
- **VTXTV**: [*vtxtv*](## "netrc machine")

View File

@@ -21,9 +21,6 @@ class TestCompat(unittest.TestCase):
with self.assertWarns(DeprecationWarning):
_ = compat.compat_basestring
with self.assertWarns(DeprecationWarning):
_ = compat.WINDOWS_VT_MODE
self.assertEqual(urllib.request.getproxies, getproxies)
with self.assertWarns(DeprecationWarning):

View File

@@ -14,7 +14,6 @@ from yt_dlp.extractor import (
NRKTVIE,
PBSIE,
CeskaTelevizeIE,
ComedyCentralIE,
DailymotionIE,
DemocracynowIE,
LyndaIE,
@@ -279,23 +278,6 @@ class TestNPOSubtitles(BaseTestSubtitles):
self.assertEqual(md5(subtitles['nl']), 'fc6435027572b63fb4ab143abd5ad3f4')
@is_download_test
@unittest.skip('IE broken')
class TestMTVSubtitles(BaseTestSubtitles):
url = 'http://www.cc.com/video-clips/p63lk0/adam-devine-s-house-party-chasing-white-swans'
IE = ComedyCentralIE
def getInfoDict(self):
return super().getInfoDict()['entries'][0]
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), {'en'})
self.assertEqual(md5(subtitles['en']), '78206b8d8a0cfa9da64dc026eea48961')
@is_download_test
class TestNRKSubtitles(BaseTestSubtitles):
url = 'http://tv.nrk.no/serie/ikke-gjoer-dette-hjemme/DMPV73000411/sesong-2/episode-1'

View File

@@ -84,8 +84,9 @@ lock 2023.11.16 (?!win_x86_exe).+ Python 3\.7
lock 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
lock 2024.10.22 py2exe .+
lock 2024.10.22 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lock 2024.10.22 (?!\w+_exe).+ Python 3\.8
lock 2024.10.22 zip Python 3\.8
lock 2024.10.22 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lock 2025.08.11 darwin_legacy_exe .+
'''
TEST_LOCKFILE_V2_TMPL = r'''%s
@@ -94,20 +95,23 @@ lockV2 yt-dlp/yt-dlp 2023.11.16 (?!win_x86_exe).+ Python 3\.7
lockV2 yt-dlp/yt-dlp 2023.11.16 win_x86_exe .+ Windows-(?:Vista|2008Server)
lockV2 yt-dlp/yt-dlp 2024.10.22 py2exe .+
lockV2 yt-dlp/yt-dlp 2024.10.22 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lockV2 yt-dlp/yt-dlp 2024.10.22 (?!\w+_exe).+ Python 3\.8
lockV2 yt-dlp/yt-dlp 2024.10.22 zip Python 3\.8
lockV2 yt-dlp/yt-dlp 2024.10.22 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lockV2 yt-dlp/yt-dlp 2025.08.11 darwin_legacy_exe .+
lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 (?!win_x86_exe).+ Python 3\.7
lockV2 yt-dlp/yt-dlp-nightly-builds 2023.11.15.232826 win_x86_exe .+ Windows-(?:Vista|2008Server)
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 py2exe .+
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 (?!\w+_exe).+ Python 3\.8
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 zip Python 3\.8
lockV2 yt-dlp/yt-dlp-nightly-builds 2024.10.22.051025 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lockV2 yt-dlp/yt-dlp-nightly-builds 2025.08.12.233030 darwin_legacy_exe .+
lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 (?!win_x86_exe).+ Python 3\.7
lockV2 yt-dlp/yt-dlp-master-builds 2023.11.15.232812 win_x86_exe .+ Windows-(?:Vista|2008Server)
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.045052 py2exe .+
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 linux_(?:armv7l|aarch64)_exe .+-glibc2\.(?:[12]?\d|30)\b
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 (?!\w+_exe).+ Python 3\.8
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 zip Python 3\.8
lockV2 yt-dlp/yt-dlp-master-builds 2024.10.22.060347 win(?:_x86)?_exe Python 3\.[78].+ Windows-(?:7-|2008ServerR2)
lockV2 yt-dlp/yt-dlp-master-builds 2025.08.12.232447 darwin_legacy_exe .+
'''
TEST_LOCKFILE_V2 = TEST_LOCKFILE_V2_TMPL % TEST_LOCKFILE_COMMENT
@@ -217,6 +221,10 @@ class TestUpdate(unittest.TestCase):
test( # linux_aarch64_exe w/glibc2.3 should only update to glibc<2.31 lock
lockfile, 'linux_aarch64_exe Python 3.8.0 (CPython aarch64 64bit) - Linux-6.5.0-1025-azure-aarch64-with-glibc2.3 (OpenSSL',
'2025.01.01', '2024.10.22')
test(lockfile, 'darwin_legacy_exe Python 3.10.5', '2025.08.11', '2025.08.11')
test(lockfile, 'darwin_legacy_exe Python 3.10.5', '2025.08.11', '2025.08.11', exact=True)
test(lockfile, 'darwin_legacy_exe Python 3.10.5', '2025.08.12', '2025.08.11')
test(lockfile, 'darwin_legacy_exe Python 3.10.5', '2025.08.12', None, exact=True)
# Forks can block updates to non-numeric tags rather than lock
test(TEST_LOCKFILE_FORK, 'zip Python 3.6.3', 'pr0000', None, repo='fork/yt-dlp')

View File

@@ -71,6 +71,8 @@ from yt_dlp.utils import (
iri_to_uri,
is_html,
js_to_json,
jwt_decode_hs256,
jwt_encode,
limit_length,
locked_file,
lowercase_escape,
@@ -1373,6 +1375,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_resolution('pre_1920x1080_post'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('ep1x2'), {})
self.assertEqual(parse_resolution('1920, 1080'), {'width': 1920, 'height': 1080})
self.assertEqual(parse_resolution('1920w', lenient=True), {'width': 1920})
def test_parse_bitrate(self):
self.assertEqual(parse_bitrate(None), None)
@@ -2179,6 +2182,41 @@ Line 1
assert int_or_none(v=10) == 10, 'keyword passed positional should call function'
assert int_or_none(scale=0.1)(10) == 100, 'call after partial application should call the function'
_JWT_KEY = '12345678'
_JWT_HEADERS_1 = {'a': 'b'}
_JWT_HEADERS_2 = {'typ': 'JWT', 'alg': 'HS256'}
_JWT_HEADERS_3 = {'typ': 'JWT', 'alg': 'RS256'}
_JWT_HEADERS_4 = {'c': 'd', 'alg': 'ES256'}
_JWT_DECODED = {
'foo': 'bar',
'qux': 'baz',
}
_JWT_SIMPLE = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJmb28iOiJiYXIiLCJxdXgiOiJiYXoifQ.fKojvTWqnjNTbsdoDTmYNc4tgYAG3h_SWRzM77iLH0U'
_JWT_WITH_EXTRA_HEADERS = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCIsImEiOiJiIn0.eyJmb28iOiJiYXIiLCJxdXgiOiJiYXoifQ.Ia91-B77yasfYM7jsB6iVKLew-3rO6ITjNmjWUVXCvQ'
_JWT_WITH_REORDERED_HEADERS = 'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJmb28iOiJiYXIiLCJxdXgiOiJiYXoifQ.slg-7COta5VOfB36p3tqV4MGPV6TTA_ouGnD48UEVq4'
_JWT_WITH_REORDERED_HEADERS_AND_RS256_ALG = 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJmb28iOiJiYXIiLCJxdXgiOiJiYXoifQ.XWp496oVgQnoits0OOocutdjxoaQwn4GUWWxUsKENPM'
_JWT_WITH_EXTRA_HEADERS_AND_ES256_ALG = 'eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsImMiOiJkIn0.eyJmb28iOiJiYXIiLCJxdXgiOiJiYXoifQ.oM_tc7IkfrwkoRh43rFFE1wOi3J3mQGwx7_lMyKQqDg'
def test_jwt_encode(self):
def test(expected, headers={}):
self.assertEqual(jwt_encode(self._JWT_DECODED, self._JWT_KEY, headers=headers), expected)
test(self._JWT_SIMPLE)
test(self._JWT_WITH_EXTRA_HEADERS, headers=self._JWT_HEADERS_1)
test(self._JWT_WITH_REORDERED_HEADERS, headers=self._JWT_HEADERS_2)
test(self._JWT_WITH_REORDERED_HEADERS_AND_RS256_ALG, headers=self._JWT_HEADERS_3)
test(self._JWT_WITH_EXTRA_HEADERS_AND_ES256_ALG, headers=self._JWT_HEADERS_4)
def test_jwt_decode_hs256(self):
def test(inp):
self.assertEqual(jwt_decode_hs256(inp), self._JWT_DECODED)
test(self._JWT_SIMPLE)
test(self._JWT_WITH_EXTRA_HEADERS)
test(self._JWT_WITH_REORDERED_HEADERS)
test(self._JWT_WITH_REORDERED_HEADERS_AND_RS256_ALG)
test(self._JWT_WITH_EXTRA_HEADERS_AND_ES256_ALG)
if __name__ == '__main__':
unittest.main()

View File

@@ -138,6 +138,16 @@ _SIG_TESTS = [
'gN7a-hudCuAuPH6fByOk1_GNXN0yNMHShjZXS2VOgsEItAJz0tipeavEOmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt',
'JC2JfQdSswRAIgGBCxZyAfKyi0cjXCb3DqEctUw-NYdNmOEvaepit0zJAtIEsgOV2SXZjhSHMNy0NXNG_1kOyBf6HPuAuCduh-a',
),
(
'https://www.youtube.com/s/player/010fbc8d/player_es5.vflset/en_US/base.js',
'gN7a-hudCuAuPH6fByOk1_GNXN0yNMHShjZXS2VOgsEItAJz0tipeavEOmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt',
'ttJC2JfQdSswRAIgGBCxZyAfKyi0cjXCb3DqEctUw-NYdNmOEvaepit2zJAsIEggOVaSXZjhSHMNy0NXNG_1kOyBf6HPuAuCduh-',
),
(
'https://www.youtube.com/s/player/010fbc8d/player_es6.vflset/en_US/base.js',
'gN7a-hudCuAuPH6fByOk1_GNXN0yNMHShjZXS2VOgsEItAJz0tipeavEOmNdYN-wUtcEqD3bCXjc0iyKfAyZxCBGgIARwsSdQfJ2CJtt',
'ttJC2JfQdSswRAIgGBCxZyAfKyi0cjXCb3DqEctUw-NYdNmOEvaepit2zJAsIEggOVaSXZjhSHMNy0NXNG_1kOyBf6HPuAuCduh-',
),
]
_NSIG_TESTS = [
@@ -377,6 +387,14 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/ef259203/player_ias_tce.vflset/en_US/base.js',
'rPqBC01nJpqhhi2iA2U', 'hY7dbiKFT51UIA',
),
(
'https://www.youtube.com/s/player/010fbc8d/player_es5.vflset/en_US/base.js',
'0hlOAlqjFszVvF4Z', 'R-H23bZGAsRFTg',
),
(
'https://www.youtube.com/s/player/010fbc8d/player_es6.vflset/en_US/base.js',
'0hlOAlqjFszVvF4Z', 'R-H23bZGAsRFTg',
),
]

View File

@@ -36,6 +36,7 @@ from .extractor.openload import PhantomJSwrapper
from .globals import (
IN_CLI,
LAZY_EXTRACTORS,
WINDOWS_VT_MODE,
plugin_ies,
plugin_ies_overrides,
plugin_pps,
@@ -72,6 +73,7 @@ from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping
from .update import (
REPOSITORY,
_get_system_deprecation,
_get_outdated_warning,
_make_label,
current_git_head,
detect_variant,
@@ -503,6 +505,7 @@ class YoutubeDL:
force_keyframes_at_cuts: Re-encode the video when downloading ranges to get precise cuts
noprogress: Do not print the progress bar
live_from_start: Whether to download livestreams videos from the start
warn_when_outdated: Emit a warning if the yt-dlp version is older than 90 days
The following parameters are not used by YoutubeDL itself, they are used by
the downloader (see yt_dlp/downloader/common.py):
@@ -596,7 +599,7 @@ class YoutubeDL:
_NUMERIC_FIELDS = {
'width', 'height', 'asr', 'audio_channels', 'fps',
'tbr', 'abr', 'vbr', 'filesize', 'filesize_approx',
'timestamp', 'release_timestamp',
'timestamp', 'release_timestamp', 'available_at',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'average_rating', 'comment_count', 'age_limit',
'start_time', 'end_time',
@@ -606,13 +609,13 @@ class YoutubeDL:
_format_fields = {
# NB: Keep in sync with the docstring of extractor/common.py
'url', 'manifest_url', 'manifest_stream_number', 'ext', 'format', 'format_id', 'format_note',
'url', 'manifest_url', 'manifest_stream_number', 'ext', 'format', 'format_id', 'format_note', 'available_at',
'width', 'height', 'aspect_ratio', 'resolution', 'dynamic_range', 'tbr', 'abr', 'acodec', 'asr', 'audio_channels',
'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns', 'hls_media_playlist_data',
'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start', 'is_dash_periods', 'request_data',
'preference', 'language', 'language_preference', 'quality', 'source_preference', 'cookies',
'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'extra_param_to_key_url',
'hls_aes', 'downloader_options', 'page_url', 'app', 'play_path', 'tc_url', 'flash_version',
'hls_aes', 'downloader_options', 'impersonate', 'page_url', 'app', 'play_path', 'tc_url', 'flash_version',
'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time',
}
_deprecated_multivalue_fields = {
@@ -702,6 +705,9 @@ class YoutubeDL:
system_deprecation = _get_system_deprecation()
if system_deprecation:
self.deprecated_feature(system_deprecation.replace('\n', '\n '))
elif self.params.get('warn_when_outdated'):
if outdated_warning := _get_outdated_warning():
self.report_warning(outdated_warning)
if self.params.get('allow_unplayable_formats'):
self.report_warning(
@@ -748,8 +754,6 @@ class YoutubeDL:
if self.params.get('geo_verification_proxy') is None:
self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
check_deprecated('autonumber', '--auto-number', '-o "%(autonumber)s-%(title)s.%(ext)s"')
check_deprecated('usetitle', '--title', '-o "%(title)s-%(id)s.%(ext)s"')
check_deprecated('useid', '--id', '-o "%(id)s.%(ext)s"')
for msg in self.params.get('_warnings', []):
@@ -4040,8 +4044,7 @@ class YoutubeDL:
if os.environ.get('TERM', '').lower() == 'dumb':
additional_info.append('dumb')
if not supports_terminal_sequences(stream):
from .utils import WINDOWS_VT_MODE # Must be imported locally
additional_info.append('No VT' if WINDOWS_VT_MODE is False else 'No ANSI')
additional_info.append('No VT' if WINDOWS_VT_MODE.value is False else 'No ANSI')
if additional_info:
ret = f'{ret} ({",".join(additional_info)})'
return ret

View File

@@ -500,6 +500,14 @@ def validate_options(opts):
'To let yt-dlp download and merge the best available formats, simply do not pass any format selection',
'If you know what you are doing and want only the best pre-merged format, use "-f b" instead to suppress this warning')))
# Common mistake: -f mp4
if opts.format == 'mp4':
warnings.append('.\n '.join((
'"-f mp4" selects the best pre-merged mp4 format which is often not what\'s intended',
'Pre-merged mp4 formats are not available from all sites, or may only be available in lower quality',
'To prioritize the best h264 video and aac audio in an mp4 container, use "-t mp4" instead',
'If you know what you are doing and want a pre-merged mp4 format, use "-f b[ext=mp4]" instead to suppress this warning')))
# --(postprocessor/downloader)-args without name
def report_args_compat(name, value, key1, key2=None, where=None):
if key1 in value and key2 not in value:
@@ -971,6 +979,7 @@ def parse_options(argv=None):
'geo_bypass': opts.geo_bypass,
'geo_bypass_country': opts.geo_bypass_country,
'geo_bypass_ip_block': opts.geo_bypass_ip_block,
'warn_when_outdated': opts.update_self is None,
'_warnings': warnings,
'_deprecation_warnings': deprecation_warnings,
'compat_opts': opts.compat_opts,
@@ -1030,6 +1039,7 @@ def _real_main(argv=None):
(ImpersonateTarget('safari'), 'curl_cffi'),
(ImpersonateTarget('firefox'), 'curl_cffi>=0.10'),
(ImpersonateTarget('edge'), 'curl_cffi'),
(ImpersonateTarget('tor'), 'curl_cffi>=0.11'),
]
available_targets = ydl._get_available_impersonate_targets()

View File

@@ -37,7 +37,7 @@ from ..dependencies import websockets as compat_websockets # noqa: F401
from ..dependencies.Cryptodome import AES as compat_pycrypto_AES # noqa: F401
from ..networking.exceptions import HTTPError as compat_HTTPError
passthrough_module(__name__, '...utils', ('WINDOWS_VT_MODE', 'windows_enable_vt_mode'))
passthrough_module(__name__, '...utils', ('windows_enable_vt_mode',))
# compat_ctypes_WINFUNCTYPE = ctypes.WINFUNCTYPE

View File

@@ -1335,7 +1335,7 @@ class YoutubeDLCookieJar(http.cookiejar.MozillaCookieJar):
if len(cookie_list) != self._ENTRY_LEN:
raise http.cookiejar.LoadError(f'invalid length {len(cookie_list)}')
cookie = self._CookieFileEntry(*cookie_list)
if cookie.expires_at and not cookie.expires_at.isdigit():
if cookie.expires_at and not re.fullmatch(r'[0-9]+(?:\.[0-9]+)?', cookie.expires_at):
raise http.cookiejar.LoadError(f'invalid expires at {cookie.expires_at}')
return line

View File

@@ -455,14 +455,26 @@ class FileDownloader:
self._finish_multiline_status()
return True, False
sleep_note = ''
if subtitle:
sleep_interval = self.params.get('sleep_interval_subtitles') or 0
else:
min_sleep_interval = self.params.get('sleep_interval') or 0
max_sleep_interval = self.params.get('max_sleep_interval') or 0
if available_at := info_dict.get('available_at'):
forced_sleep_interval = available_at - int(time.time())
if forced_sleep_interval > min_sleep_interval:
sleep_note = 'as required by the site'
min_sleep_interval = forced_sleep_interval
if forced_sleep_interval > max_sleep_interval:
max_sleep_interval = forced_sleep_interval
sleep_interval = random.uniform(
min_sleep_interval, self.params.get('max_sleep_interval') or min_sleep_interval)
min_sleep_interval, max_sleep_interval or min_sleep_interval)
if sleep_interval > 0:
self.to_screen(f'[download] Sleeping {sleep_interval:.2f} seconds ...')
self.to_screen(f'[download] Sleeping {sleep_interval:.2f} seconds {sleep_note}...')
time.sleep(sleep_interval)
ret = self.real_download(filename, info_dict)

View File

@@ -3,7 +3,7 @@ import urllib.parse
from . import get_suitable_downloader
from .fragment import FragmentFD
from ..utils import update_url_query, urljoin
from ..utils import ReExtractInfo, update_url_query, urljoin
class DashSegmentsFD(FragmentFD):
@@ -28,6 +28,11 @@ class DashSegmentsFD(FragmentFD):
requested_formats = [{**info_dict, **fmt} for fmt in info_dict.get('requested_formats', [])]
args = []
for fmt in requested_formats or [info_dict]:
# Re-extract if --load-info-json is used and 'fragments' was originally a generator
# See https://github.com/yt-dlp/yt-dlp/issues/13906
if isinstance(fmt['fragments'], str):
raise ReExtractInfo('the stream needs to be re-extracted', expected=True)
try:
fragment_count = 1 if self.params.get('test') else len(fmt['fragments'])
except TypeError:

View File

@@ -572,7 +572,21 @@ class FFmpegFD(ExternalFD):
if end_time:
args += ['-t', str(end_time - start_time)]
args += [*self._configuration_args((f'_i{i + 1}', '_i')), '-i', fmt['url']]
url = fmt['url']
if self.params.get('enable_file_urls') and url.startswith('file:'):
# The default protocol_whitelist is 'file,crypto,data' when reading local m3u8 URLs,
# so only local segments can be read unless we also include 'http,https,tcp,tls'
args += ['-protocol_whitelist', 'file,crypto,data,http,https,tcp,tls']
# ffmpeg incorrectly handles 'file:' URLs by only removing the
# 'file:' prefix and treating the rest as if it's a normal filepath.
# FFmpegPostProcessor also depends on this behavior, so we need to fixup the URLs:
# - On Windows/Cygwin, replace 'file:///' and 'file://localhost/' with 'file:'
# - On *nix, replace 'file://localhost/' with 'file:/'
# Ref: https://github.com/yt-dlp/yt-dlp/issues/13781
# https://trac.ffmpeg.org/ticket/2702
url = re.sub(r'^file://(?:localhost)?/', 'file:' if os.name == 'nt' else 'file:/', url)
args += [*self._configuration_args((f'_i{i + 1}', '_i')), '-i', url]
if not (start_time or end_time) or not self.params.get('force_keyframes_at_cuts'):
args += ['-c', 'copy']

View File

@@ -205,7 +205,7 @@ class HlsFD(FragmentFD):
line = line.strip()
if line:
if not line.startswith('#'):
if format_index and discontinuity_count != format_index:
if format_index is not None and discontinuity_count != format_index:
continue
if ad_frag_next:
continue
@@ -231,7 +231,7 @@ class HlsFD(FragmentFD):
byte_range = {}
elif line.startswith('#EXT-X-MAP'):
if format_index and discontinuity_count != format_index:
if format_index is not None and discontinuity_count != format_index:
continue
if frag_index > 0:
self.report_error(

View File

@@ -58,13 +58,7 @@ from .adn import (
ADNSeasonIE,
)
from .adobeconnect import AdobeConnectIE
from .adobetv import (
AdobeTVChannelIE,
AdobeTVEmbedIE,
AdobeTVIE,
AdobeTVShowIE,
AdobeTVVideoIE,
)
from .adobetv import AdobeTVVideoIE
from .adultswim import AdultSwimIE
from .aenetworks import (
AENetworksCollectionIE,
@@ -152,7 +146,6 @@ from .ard import (
ARDBetaMediathekIE,
ARDMediathekCollectionIE,
)
from .arkena import ArkenaIE
from .arnes import ArnesIE
from .art19 import (
Art19IE,
@@ -405,16 +398,12 @@ from .cloudflarestream import CloudflareStreamIE
from .cloudycdn import CloudyCDNIE
from .clubic import ClubicIE
from .clyp import ClypIE
from .cmt import CMTIE
from .cnbc import CNBCVideoIE
from .cnn import (
CNNIE,
CNNIndonesiaIE,
)
from .comedycentral import (
ComedyCentralIE,
ComedyCentralTVIE,
)
from .comedycentral import ComedyCentralIE
from .commonmistakes import (
BlobIE,
CommonMistakesIE,
@@ -571,10 +560,6 @@ from .dw import (
DWIE,
DWArticleIE,
)
from .eagleplatform import (
ClipYouEmbedIE,
EaglePlatformIE,
)
from .ebaumsworld import EbaumsWorldIE
from .ebay import EbayIE
from .egghead import (
@@ -640,6 +625,10 @@ from .fancode import (
FancodeVodIE,
)
from .fathom import FathomIE
from .faulio import (
FaulioIE,
FaulioLiveIE,
)
from .faz import FazIE
from .fc2 import (
FC2IE,
@@ -1190,15 +1179,7 @@ from .moview import MoviewPlayIE
from .moviezine import MoviezineIE
from .movingimage import MovingImageIE
from .msn import MSNIE
from .mtv import (
MTVDEIE,
MTVIE,
MTVItaliaIE,
MTVItaliaProgrammaIE,
MTVJapanIE,
MTVServicesEmbeddedIE,
MTVVideoIE,
)
from .mtv import MTVIE
from .muenchentv import MuenchenTVIE
from .murrtube import (
MurrtubeIE,
@@ -1340,12 +1321,7 @@ from .nhk import (
NhkVodProgramIE,
)
from .nhl import NHLIE
from .nick import (
NickBrIE,
NickDeIE,
NickIE,
NickRuIE,
)
from .nick import NickIE
from .niconico import (
NiconicoHistoryIE,
NiconicoIE,
@@ -1551,7 +1527,6 @@ from .pixivsketch import (
PixivSketchIE,
PixivSketchUserIE,
)
from .pladform import PladformIE
from .planetmarathi import PlanetMarathiIE
from .platzi import (
PlatziCourseIE,
@@ -1568,6 +1543,7 @@ from .pluralsight import (
)
from .plutotv import PlutoTVIE
from .plvideo import PlVideoIE
from .plyr import PlyrEmbedIE
from .podbayfm import (
PodbayFMChannelIE,
PodbayFMIE,
@@ -1783,6 +1759,7 @@ from .rtve import (
RTVEALaCartaIE,
RTVEAudioIE,
RTVELiveIE,
RTVEProgramIE,
RTVETelevisionIE,
)
from .rtvs import RTVSIE
@@ -1867,6 +1844,7 @@ from .shahid import (
from .sharepoint import SharePointIE
from .sharevideos import ShareVideosEmbedIE
from .shemaroome import ShemarooMeIE
from .shiey import ShieyIE
from .showroomlive import ShowRoomLiveIE
from .sibnet import SibnetEmbedIE
from .simplecast import (
@@ -1931,12 +1909,13 @@ from .soundgasm import (
SoundgasmProfileIE,
)
from .southpark import (
SouthParkComBrIE,
SouthParkCoUkIE,
SouthParkDeIE,
SouthParkDkIE,
SouthParkEsIE,
SouthParkIE,
SouthParkLatIE,
SouthParkNlIE,
)
from .sovietscloset import (
SovietsClosetIE,
@@ -1947,10 +1926,6 @@ from .spankbang import (
SpankBangPlaylistIE,
)
from .spiegel import SpiegelIE
from .spike import (
BellatorIE,
ParamountNetworkIE,
)
from .sport5 import Sport5IE
from .sportbox import SportBoxIE
from .sportdeutschland import SportDeutschlandIE
@@ -2166,7 +2141,6 @@ from .trtworld import TrtWorldIE
from .trueid import TrueIDIE
from .trunews import TruNewsIE
from .truth import TruthIE
from .trutv import TruTVIE
from .tube8 import Tube8IE
from .tubetugraz import (
TubeTuGrazIE,
@@ -2216,7 +2190,6 @@ from .tvc import (
from .tver import TVerIE
from .tvigle import TvigleIE
from .tviplayer import TVIPlayerIE
from .tvland import TVLandIE
from .tvn24 import TVN24IE
from .tvnoe import TVNoeIE
from .tvopengr import (
@@ -2237,6 +2210,7 @@ from .tvplay import (
from .tvplayer import TVPlayerIE
from .tvw import (
TvwIE,
TvwNewsIE,
TvwTvChannelsIE,
)
from .tweakers import TweakersIE
@@ -2313,10 +2287,6 @@ from .varzesh3 import Varzesh3IE
from .vbox7 import Vbox7IE
from .veo import VeoIE
from .vesti import VestiIE
from .vevo import (
VevoIE,
VevoPlaylistIE,
)
from .vgtv import (
VGTVIE,
BTArticleIE,

View File

@@ -48,7 +48,6 @@ MSO_INFO = {
'username_field': 'user',
'password_field': 'passwd',
'login_hostname': 'login.xfinity.com',
'needs_newer_ua': True,
},
'TWC': {
'name': 'Time Warner Cable | Spectrum',
@@ -1379,11 +1378,8 @@ class AdobePassIE(InfoExtractor): # XXX: Conventionally, base classes should en
@staticmethod
def _get_mso_headers(mso_info):
# yt-dlp's default user-agent is usually too old for some MSO's like Comcast_SSO
# See: https://github.com/yt-dlp/yt-dlp/issues/10848
return {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:131.0) Gecko/20100101 Firefox/131.0',
} if mso_info.get('needs_newer_ua') else {}
# Not needed currently
return {}
@staticmethod
def _get_mvpd_resource(provider_id, title, guid, rating):

View File

@@ -1,285 +1,100 @@
import functools
import re
from .common import InfoExtractor
from ..utils import (
ISO639Utils,
OnDemandPagedList,
clean_html,
determine_ext,
float_or_none,
int_or_none,
join_nonempty,
parse_duration,
str_or_none,
str_to_int,
unified_strdate,
url_or_none,
)
from ..utils.traversal import traverse_obj
class AdobeTVBaseIE(InfoExtractor):
def _call_api(self, path, video_id, query, note=None):
return self._download_json(
'http://tv.adobe.com/api/v4/' + path,
video_id, note, query=query)['data']
def _parse_subtitles(self, video_data, url_key):
subtitles = {}
for translation in video_data.get('translations', []):
vtt_path = translation.get(url_key)
if not vtt_path:
continue
lang = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
subtitles.setdefault(lang, []).append({
'ext': 'vtt',
'url': vtt_path,
})
return subtitles
def _parse_video_data(self, video_data):
video_id = str(video_data['id'])
title = video_data['title']
s3_extracted = False
formats = []
for source in video_data.get('videos', []):
source_url = source.get('url')
if not source_url:
continue
f = {
'format_id': source.get('quality_level'),
'fps': int_or_none(source.get('frame_rate')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
'width': int_or_none(source.get('width')),
'url': source_url,
}
original_filename = source.get('original_filename')
if original_filename:
if not (f.get('height') and f.get('width')):
mobj = re.search(r'_(\d+)x(\d+)', original_filename)
if mobj:
f.update({
'height': int(mobj.group(2)),
'width': int(mobj.group(1)),
})
if original_filename.startswith('s3://') and not s3_extracted:
formats.append({
'format_id': 'original',
'quality': 1,
'url': original_filename.replace('s3://', 'https://s3.amazonaws.com/'),
})
s3_extracted = True
formats.append(f)
return {
'id': video_id,
'title': title,
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
'subtitles': self._parse_subtitles(video_data, 'vtt'),
}
class AdobeTVEmbedIE(AdobeTVBaseIE):
IE_NAME = 'adobetv:embed'
_VALID_URL = r'https?://tv\.adobe\.com/embed/\d+/(?P<id>\d+)'
_TEST = {
'url': 'https://tv.adobe.com/embed/22/4153',
'md5': 'c8c0461bf04d54574fc2b4d07ac6783a',
'info_dict': {
'id': '4153',
'ext': 'flv',
'title': 'Creating Graphics Optimized for BlackBerry',
'description': 'md5:eac6e8dced38bdaae51cd94447927459',
'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20091109',
'duration': 377,
'view_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._call_api(
'episode/' + video_id, video_id, {'disclosure': 'standard'})[0]
return self._parse_video_data(video_data)
class AdobeTVIE(AdobeTVBaseIE):
class AdobeTVVideoIE(InfoExtractor):
IE_NAME = 'adobetv'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)'
_TEST = {
'url': 'http://tv.adobe.com/watch/the-complete-picture-with-julieanne-kost/quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop/',
'md5': '9bc5727bcdd55251f35ad311ca74fa1e',
'info_dict': {
'id': '10981',
'ext': 'mp4',
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20110914',
'duration': 60,
'view_count': int,
},
}
def _real_extract(self, url):
language, show_urlname, urlname = self._match_valid_url(url).groups()
if not language:
language = 'en'
video_data = self._call_api(
'episode/get', urlname, {
'disclosure': 'standard',
'language': language,
'show_urlname': show_urlname,
'urlname': urlname,
})[0]
return self._parse_video_data(video_data)
class AdobeTVPlaylistBaseIE(AdobeTVBaseIE):
_PAGE_SIZE = 25
def _fetch_page(self, display_id, query, page):
page += 1
query['page'] = page
for element_data in self._call_api(
self._RESOURCE, display_id, query, f'Download Page {page}'):
yield self._process_data(element_data)
def _extract_playlist_entries(self, display_id, query):
return OnDemandPagedList(functools.partial(
self._fetch_page, display_id, query), self._PAGE_SIZE)
class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:show'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)'
_TEST = {
'url': 'http://tv.adobe.com/show/the-complete-picture-with-julieanne-kost',
'info_dict': {
'id': '36',
'title': 'The Complete Picture with Julieanne Kost',
'description': 'md5:fa50867102dcd1aa0ddf2ab039311b27',
},
'playlist_mincount': 136,
}
_RESOURCE = 'episode'
_process_data = AdobeTVBaseIE._parse_video_data
def _real_extract(self, url):
language, show_urlname = self._match_valid_url(url).groups()
if not language:
language = 'en'
query = {
'disclosure': 'standard',
'language': language,
'show_urlname': show_urlname,
}
show_data = self._call_api(
'show/get', show_urlname, query)[0]
return self.playlist_result(
self._extract_playlist_entries(show_urlname, query),
str_or_none(show_data.get('id')),
show_data.get('show_name'),
show_data.get('show_description'))
class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:channel'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?'
_TEST = {
'url': 'http://tv.adobe.com/channel/development',
'info_dict': {
'id': 'development',
},
'playlist_mincount': 96,
}
_RESOURCE = 'show'
def _process_data(self, show_data):
return self.url_result(
show_data['url'], 'AdobeTVShow', str_or_none(show_data.get('id')))
def _real_extract(self, url):
language, channel_urlname, category_urlname = self._match_valid_url(url).groups()
if not language:
language = 'en'
query = {
'channel_urlname': channel_urlname,
'language': language,
}
if category_urlname:
query['category_urlname'] = category_urlname
return self.playlist_result(
self._extract_playlist_entries(channel_urlname, query),
channel_urlname)
class AdobeTVVideoIE(AdobeTVBaseIE):
IE_NAME = 'adobetv:video'
_VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)'
_EMBED_REGEX = [r'<iframe[^>]+src=[\'"](?P<url>(?:https?:)?//video\.tv\.adobe\.com/v/\d+[^"]+)[\'"]']
_TEST = {
# From https://helpx.adobe.com/acrobat/how-to/new-experience-acrobat-dc.html?set=acrobat--get-started--essential-beginners
'url': 'https://video.tv.adobe.com/v/2456/',
_EMBED_REGEX = [r'<iframe[^>]+src=["\'](?P<url>(?:https?:)?//video\.tv\.adobe\.com/v/\d+)']
_TESTS = [{
'url': 'https://video.tv.adobe.com/v/2456',
'md5': '43662b577c018ad707a63766462b1e87',
'info_dict': {
'id': '2456',
'ext': 'mp4',
'title': 'New experience with Acrobat DC',
'description': 'New experience with Acrobat DC',
'duration': 248.667,
'duration': 248.522,
'thumbnail': r're:https?://images-tv\.adobe\.com/.+\.jpg',
},
}
}, {
'url': 'https://video.tv.adobe.com/v/3463980/adobe-acrobat',
'info_dict': {
'id': '3463980',
'ext': 'mp4',
'title': 'Adobe Acrobat: How to Customize the Toolbar for Faster PDF Editing',
'description': 'md5:94368ab95ae24f9c1bee0cb346e03dc3',
'duration': 97.514,
'thumbnail': r're:https?://images-tv\.adobe\.com/.+\.jpg',
},
}]
_WEBPAGE_TESTS = [{
# https://video.tv.adobe.com/v/3442499
'url': 'https://business.adobe.com/dx-fragments/summit/2025/marquees/S335/ondemand.live.html',
'info_dict': {
'id': '3442499',
'ext': 'mp4',
'title': 'S335 - Beyond Personalization: Creating Intent-Based Experiences at Scale',
'description': 'Beyond Personalization: Creating Intent-Based Experiences at Scale',
'duration': 2906.8,
'thumbnail': r're:https?://images-tv\.adobe\.com/.+\.jpg',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_data = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id)
title = video_data['title']
video_data = self._search_json(
r'var\s+bridge\s*=', webpage, 'bridged data', video_id)
formats = []
sources = video_data.get('sources') or []
for source in sources:
source_src = source.get('src')
if not source_src:
continue
formats.append({
'filesize': int_or_none(source.get('kilobytes') or None, invscale=1000),
'format_id': join_nonempty(source.get('format'), source.get('label')),
'height': int_or_none(source.get('height') or None),
'tbr': int_or_none(source.get('bitrate') or None),
'width': int_or_none(source.get('width') or None),
'url': source_src,
})
for source in traverse_obj(video_data, (
'sources', lambda _, v: v['format'] != 'playlist' and url_or_none(v['src']),
)):
source_url = self._proto_relative_url(source['src'])
if determine_ext(source_url) == 'm3u8':
fmts = self._extract_m3u8_formats(
source_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
else:
fmts = [{'url': source_url}]
# For both metadata and downloaded files the duration varies among
# formats. I just pick the max one
duration = max(filter(None, [
float_or_none(source.get('duration'), scale=1000)
for source in sources]))
for fmt in fmts:
fmt.update(traverse_obj(source, {
'duration': ('duration', {float_or_none(scale=1000)}),
'filesize': ('kilobytes', {float_or_none(invscale=1000)}),
'format_id': (('format', 'label'), {str}, all, {lambda x: join_nonempty(*x)}),
'height': ('height', {int_or_none}),
'tbr': ('bitrate', {int_or_none}),
'width': ('width', {int_or_none}),
}))
formats.extend(fmts)
subtitles = {}
for translation in traverse_obj(video_data, (
'translations', lambda _, v: url_or_none(v['vttPath']),
)):
lang = translation.get('language_w3c') or ISO639Utils.long2short(translation.get('language_medium')) or 'und'
subtitles.setdefault(lang, []).append({
'ext': 'vtt',
'url': self._proto_relative_url(translation['vttPath']),
})
return {
'id': video_id,
'formats': formats,
'title': title,
'description': video_data.get('description'),
'thumbnail': video_data.get('video', {}).get('poster'),
'duration': duration,
'subtitles': self._parse_subtitles(video_data, 'vttPath'),
'subtitles': subtitles,
**traverse_obj(video_data, {
'title': ('title', {clean_html}),
'description': ('description', {clean_html}, filter),
'thumbnail': ('video', 'poster', {self._proto_relative_url}, {url_or_none}),
}),
}

View File

@@ -11,12 +11,11 @@ class APAIE(InfoExtractor):
_EMBED_REGEX = [r'<iframe[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//[^/]+\.apa\.at/embed/[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}.*?)\1']
_TESTS = [{
'url': 'http://uvp.apa.at/embed/293f6d17-692a-44e3-9fd5-7b178f3a1029',
'md5': '2b12292faeb0a7d930c778c7a5b4759b',
'info_dict': {
'id': '293f6d17-692a-44e3-9fd5-7b178f3a1029',
'ext': 'mp4',
'title': '293f6d17-692a-44e3-9fd5-7b178f3a1029',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://kf-vn\.sf\.apa\.at/vn/.+\.jpg',
},
}, {
'url': 'https://uvp-apapublisher.sf.apa.at/embed/2f94e9e6-d945-4db2-9548-f9a41ebf7b78',
@@ -28,6 +27,15 @@ class APAIE(InfoExtractor):
'url': 'http://uvp-kleinezeitung.sf.apa.at/embed/f1c44979-dba2-4ebf-b021-e4cf2cac3c81',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.vol.at/blue-man-group/5593454',
'info_dict': {
'id': '293f6d17-692a-44e3-9fd5-7b178f3a1029',
'ext': 'mp4',
'title': '293f6d17-692a-44e3-9fd5-7b178f3a1029',
'thumbnail': r're:https?://kf-vn\.sf\.apa\.at/vn/.+\.jpg',
},
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)

View File

@@ -33,7 +33,6 @@ from ..utils import (
unified_timestamp,
url_or_none,
urlhandle_detect_ext,
variadic,
)
@@ -232,6 +231,23 @@ class ArchiveOrgIE(InfoExtractor):
'release_date': '19950402',
'timestamp': 1084927901,
},
}, {
# metadata['metadata']['description'] is a list of strings instead of str
'url': 'https://archive.org/details/pra-KZ1908.02',
'info_dict': {
'id': 'pra-KZ1908.02',
'ext': 'mp3',
'display_id': 'KZ1908.02_01.wav',
'title': 'Crips and Bloods speak about gang life',
'description': 'md5:2b56b35ff021311e3554b47a285e70b3',
'uploader': 'jake@archive.org',
'duration': 1733.74,
'track': 'KZ1908.02 01',
'track_number': 1,
'timestamp': 1336026026,
'upload_date': '20120503',
'release_year': 1992,
},
}]
@staticmethod
@@ -274,34 +290,40 @@ class ArchiveOrgIE(InfoExtractor):
m = metadata['metadata']
identifier = m['identifier']
info = {
info = traverse_obj(m, {
'title': ('title', {str}),
'description': ('description', ({str}, (..., all, {' '.join})), {clean_html}, filter, any),
'uploader': (('uploader', 'adder'), {str}, any),
'creators': ('creator', (None, ...), {str}, filter, all, filter),
'license': ('licenseurl', {url_or_none}),
'release_date': ('date', {unified_strdate}),
'timestamp': (('publicdate', 'addeddate'), {unified_timestamp}, any),
'location': ('venue', {str}),
'release_year': ('year', {int_or_none}),
})
info.update({
'id': identifier,
'title': m['title'],
'description': clean_html(m.get('description')),
'uploader': dict_get(m, ['uploader', 'adder']),
'creators': traverse_obj(m, ('creator', {variadic}, {lambda x: x[0] and list(x)})),
'license': m.get('licenseurl'),
'release_date': unified_strdate(m.get('date')),
'timestamp': unified_timestamp(dict_get(m, ['publicdate', 'addeddate'])),
'webpage_url': f'https://archive.org/details/{identifier}',
'location': m.get('venue'),
'release_year': int_or_none(m.get('year'))}
})
for f in metadata['files']:
if f['name'] in entries:
entries[f['name']] = merge_dicts(entries[f['name']], {
'id': identifier + '/' + f['name'],
'title': f.get('title') or f['name'],
'display_id': f['name'],
'description': clean_html(f.get('description')),
'creators': traverse_obj(f, ('creator', {variadic}, {lambda x: x[0] and list(x)})),
'duration': parse_duration(f.get('length')),
'track_number': int_or_none(f.get('track')),
'album': f.get('album'),
'discnumber': int_or_none(f.get('disc')),
'release_year': int_or_none(f.get('year'))})
**traverse_obj(f, {
'title': (('title', 'name'), {str}, any),
'display_id': ('name', {str}),
'description': ('description', ({str}, (..., all, {' '.join})), {clean_html}, filter, any),
'creators': ('creator', (None, ...), {str}, filter, all, filter),
'duration': ('length', {parse_duration}),
'track_number': ('track', {int_or_none}),
'album': ('album', {str}),
'discnumber': ('disc', {int_or_none}),
'release_year': ('year', {int_or_none}),
}),
})
entry = entries[f['name']]
elif traverse_obj(f, 'original', expected_type=str) in entries:
elif traverse_obj(f, ('original', {str})) in entries:
entry = entries[f['original']]
else:
continue

View File

@@ -62,6 +62,20 @@ class ArcPublishingIE(InfoExtractor):
'url': 'arcpublishing:tronc:460f2931-8130-4719-8ea1-ffcb2d7cb685',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.uppermichiganssource.com/2025/07/18/scattered-showers-storms-bring-heavy-rain-potential/',
'info_dict': {
'id': '508116f7-e999-48db-b7c2-60a04842679b',
'ext': 'mp4',
'title': 'Scattered showers & storms bring heavy rain potential',
'description': 'Scattered showers & storms bring heavy rain potential',
'duration': 2016,
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1752881287,
'upload_date': '20250718',
},
'expected_warnings': ['Ignoring subtitle tracks found in the HLS manifest'],
}]
_POWA_DEFAULTS = [
(['cmg', 'prisa'], '%s-config-prod.api.cdn.arcpublishing.com/video'),
([

View File

@@ -1,150 +0,0 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
parse_iso8601,
parse_qs,
try_get,
)
class ArkenaIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
video\.(?:arkena|qbrick)\.com/play2/embed/player\?|
play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
)
'''
# See https://support.arkena.com/display/PLAY/Ways+to+embed+your+video
_EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//play\.arkena\.com/embed/avp/.+?)\1']
_TESTS = [{
'url': 'https://video.qbrick.com/play2/embed/player?accountId=1034090&mediaId=d8ab4607-00090107-aab86310',
'md5': '97f117754e5f3c020f5f26da4a44ebaf',
'info_dict': {
'id': 'd8ab4607-00090107-aab86310',
'ext': 'mp4',
'title': 'EM_HT20_117_roslund_v2.mp4',
'timestamp': 1608285912,
'upload_date': '20201218',
'duration': 1429.162667,
'subtitles': {
'sv': 'count:3',
},
},
}, {
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
'only_matching': True,
}, {
'url': 'https://play.arkena.com/config/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411/?callbackMethod=jQuery1111023664739129262213_1469227693893',
'only_matching': True,
}, {
'url': 'http://play.arkena.com/config/avp/v1/player/media/327336/darkmatter/131064/?callbackMethod=jQuery1111002221189684892677_1469227595972',
'only_matching': True,
}, {
'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
'only_matching': True,
}, {
'url': 'http://video.arkena.com/play2/embed/player?accountId=472718&mediaId=35763b3b-00090078-bf604299&pageStyling=styled',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
account_id = mobj.group('account_id')
# Handle http://video.arkena.com/play2/embed/player URL
if not video_id:
qs = parse_qs(url)
video_id = qs.get('mediaId', [None])[0]
account_id = qs.get('accountId', [None])[0]
if not video_id or not account_id:
raise ExtractorError('Invalid URL', expected=True)
media = self._download_json(
f'https://video.qbrick.com/api/v1/public/accounts/{account_id}/medias/{video_id}',
video_id, query={
# https://video.qbrick.com/docs/api/examples/library-api.html
'fields': 'asset/resources/*/renditions/*(height,id,language,links/*(href,mimeType),type,size,videos/*(audios/*(codec,sampleRate),bitrate,codec,duration,height,width),width),created,metadata/*(title,description),tags',
})
metadata = media.get('metadata') or {}
title = metadata['title']
duration = None
formats = []
thumbnails = []
subtitles = {}
for resource in media['asset']['resources']:
for rendition in (resource.get('renditions') or []):
rendition_type = rendition.get('type')
for i, link in enumerate(rendition.get('links') or []):
href = link.get('href')
if not href:
continue
if rendition_type == 'image':
thumbnails.append({
'filesize': int_or_none(rendition.get('size')),
'height': int_or_none(rendition.get('height')),
'id': rendition.get('id'),
'url': href,
'width': int_or_none(rendition.get('width')),
})
elif rendition_type == 'subtitle':
subtitles.setdefault(rendition.get('language') or 'en', []).append({
'url': href,
})
elif rendition_type == 'video':
f = {
'filesize': int_or_none(rendition.get('size')),
'format_id': rendition.get('id'),
'url': href,
}
video = try_get(rendition, lambda x: x['videos'][i], dict)
if video:
if not duration:
duration = float_or_none(video.get('duration'))
f.update({
'height': int_or_none(video.get('height')),
'tbr': int_or_none(video.get('bitrate'), 1000),
'vcodec': video.get('codec'),
'width': int_or_none(video.get('width')),
})
audio = try_get(video, lambda x: x['audios'][0], dict)
if audio:
f.update({
'acodec': audio.get('codec'),
'asr': int_or_none(audio.get('sampleRate')),
})
formats.append(f)
elif rendition_type == 'index':
mime_type = link.get('mimeType')
if mime_type == 'application/smil+xml':
formats.extend(self._extract_smil_formats(
href, video_id, fatal=False))
elif mime_type == 'application/x-mpegURL':
formats.extend(self._extract_m3u8_formats(
href, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
elif mime_type == 'application/hds+xml':
formats.extend(self._extract_f4m_formats(
href, video_id, f4m_id='hds', fatal=False))
elif mime_type == 'application/dash+xml':
formats.extend(self._extract_mpd_formats(
href, video_id, mpd_id='dash', fatal=False))
elif mime_type == 'application/vnd.ms-sstr+xml':
formats.extend(self._extract_ism_formats(
href, video_id, ism_id='mss', fatal=False))
return {
'id': video_id,
'title': title,
'description': metadata.get('description'),
'timestamp': parse_iso8601(media.get('created')),
'thumbnails': thumbnails,
'subtitles': subtitles,
'duration': duration,
'tags': media.get('tags'),
'formats': formats,
}

View File

@@ -51,8 +51,8 @@ class ArteTVIE(ArteTVBaseIE):
'id': '109067-000-A',
'ext': 'mp4',
'description': 'md5:d2ca367b8ecee028dddaa8bd1aebc739',
'thumbnail': r're:https?://api-cdn\.arte\.tv/img/v2/image/.+',
'timestamp': 1713927600,
'thumbnail': 'https://api-cdn.arte.tv/img/v2/image/3rR6PLzfbigSkkeHtkCZNF/940x530',
'duration': 7599,
'title': 'La loi de Téhéran',
'upload_date': '20240424',
@@ -62,6 +62,7 @@ class ArteTVIE(ArteTVBaseIE):
'fr-forced': 'mincount:1',
},
},
'skip': 'Invalid URL',
}, {
'note': 'age-restricted',
'url': 'https://www.arte.tv/de/videos/006785-000-A/the-element-of-crime/',
@@ -69,9 +70,9 @@ class ArteTVIE(ArteTVBaseIE):
'id': '006785-000-A',
'description': 'md5:c2f94fdfefc8a280e4dab68ab96ab0ba',
'title': 'The Element of Crime',
'thumbnail': r're:https?://api-cdn\.arte\.tv/img/v2/image/.+',
'timestamp': 1696111200,
'duration': 5849,
'thumbnail': 'https://api-cdn.arte.tv/img/v2/image/q82dTTfyuCXupPsGxXsd7B/940x530',
'upload_date': '20230930',
'ext': 'mp4',
},
@@ -252,6 +253,30 @@ class ArteTVEmbedIE(InfoExtractor):
'url': 'https://www.arte.tv/player/v3/index.php?json_url=https://api.arte.tv/api/player/v2/config/de/100605-013-A',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# FIXME: Embed detection
'url': 'https://timesofmalta.com/article/watch-sunken-warships-north-sea-arte.1108358',
'info_dict': {
'id': '110288-000-A',
'ext': 'mp4',
'title': 'Danger on the Seabed',
'alt_title': 'Sunken Warships in the North Sea',
'description': 'md5:a2c84cbad37d280bddb6484087120add',
'duration': 3148,
'thumbnail': r're:https?://api-cdn\.arte\.tv/img/v2/image/.+',
'timestamp': 1741686820,
'upload_date': '20250311',
},
'params': {'skip_download': 'm3u8'},
}, {
# FIXME: Embed detection
'url': 'https://www.eurockeennes.fr/en-live/',
'info_dict': {
'id': 'en-live',
'title': 'Les Eurocks en live | Les Eurockéennes de Belfort 3-4-5-6 juillet 2025 sur la Presqu&#039;Île du Malsaucy',
},
'playlist_count': 4,
}]
def _real_extract(self, url):
qs = parse_qs(url)
@@ -304,9 +329,9 @@ class ArteTVCategoryIE(ArteTVBaseIE):
'info_dict': {
'id': 'politics-and-society',
'title': 'Politics and society',
'description': 'Investigative documentary series, geopolitical analysis, and international commentary',
'description': 'Watch documentaries and reportage about politics, society and current affairs.',
},
'playlist_mincount': 13,
'playlist_mincount': 3,
}]
@classmethod

View File

@@ -4,7 +4,7 @@ from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
jwt_encode_hs256,
jwt_encode,
try_get,
)
@@ -83,11 +83,10 @@ class ATVAtIE(InfoExtractor):
'nbf': int(not_before.timestamp()),
'exp': int(expire.timestamp()),
}
jwt_token = jwt_encode_hs256(payload, self._ENCRYPTION_KEY, headers={'kid': self._ACCESS_ID})
videos = self._download_json(
'https://vas-v4.p7s1video.net/4.0/getsources',
content_id, 'Downloading videos JSON', query={
'token': jwt_token.decode('utf-8'),
'token': jwt_encode(payload, self._ENCRYPTION_KEY, headers={'kid': self._ACCESS_ID}),
})
video_id, videos_data = next(iter(videos['data'].items()))

View File

@@ -36,14 +36,12 @@ class BandcampIE(InfoExtractor):
'duration': 9.8485,
'uploader': 'youtube-dl "\'/\\ä↭',
'upload_date': '20121129',
'thumbnail': r're:https?://f4\.bcbits\.com/img/.+\.jpg',
'timestamp': 1354224127,
'track': 'youtube-dl "\'/\\ä↭ - youtube-dl test song "\'/\\ä↭',
'album_artist': 'youtube-dl "\'/\\ä↭',
'track_id': '1812978515',
'artist': 'youtube-dl "\'/\\ä↭',
'uploader_url': 'https://youtube-dl.bandcamp.com',
'uploader_id': 'youtube-dl',
'thumbnail': 'https://f4.bcbits.com/img/a3216802731_5.jpg',
'artists': ['youtube-dl "\'/\\ä↭'],
'album_artists': ['youtube-dl "\'/\\ä↭'],
},
@@ -54,10 +52,9 @@ class BandcampIE(InfoExtractor):
'info_dict': {
'id': '2650410135',
'ext': 'm4a',
'acodec': r're:[fa]lac',
'title': 'Ben Prunty - Lanius (Battle)',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Ben Prunty',
'thumbnail': r're:https?://f4\.bcbits\.com/img/.+\.jpg',
'timestamp': 1396508491,
'upload_date': '20140403',
'release_timestamp': 1396483200,
@@ -66,8 +63,6 @@ class BandcampIE(InfoExtractor):
'track': 'Lanius (Battle)',
'track_number': 1,
'track_id': '2650410135',
'artist': 'Ben Prunty',
'album_artist': 'Ben Prunty',
'album': 'FTL: Advanced Edition Soundtrack',
'uploader_url': 'https://benprunty.bandcamp.com',
'uploader_id': 'benprunty',
@@ -83,8 +78,8 @@ class BandcampIE(InfoExtractor):
'id': '2584466013',
'ext': 'mp3',
'title': 'Mastodon - Hail to Fire',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Mastodon',
'thumbnail': r're:https?://f4\.bcbits\.com/img/.+\.jpg',
'timestamp': 1322005399,
'upload_date': '20111122',
'release_timestamp': 1076112000,
@@ -93,8 +88,6 @@ class BandcampIE(InfoExtractor):
'track': 'Hail to Fire',
'track_number': 5,
'track_id': '2584466013',
'artist': 'Mastodon',
'album_artist': 'Mastodon',
'album': 'Call of the Mastodon',
'uploader_url': 'https://relapsealumni.bandcamp.com',
'uploader_id': 'relapsealumni',
@@ -110,8 +103,8 @@ class BandcampIE(InfoExtractor):
'id': '1978174799',
'ext': 'mp3',
'title': 'submerse - submerse - Safehouse',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'submerse',
'thumbnail': r're:https?://f4\.bcbits\.com/img/.+\.jpg',
'timestamp': 1480779297,
'upload_date': '20161203',
'release_timestamp': 1481068800,
@@ -120,8 +113,6 @@ class BandcampIE(InfoExtractor):
'track': 'submerse - Safehouse',
'track_number': 3,
'track_id': '1978174799',
'artist': 'submerse',
'album_artist': 'Diskotopia',
'album': 'DSK F/W 2016-2017 Free Compilation',
'uploader_url': 'https://diskotopia.bandcamp.com',
'uploader_id': 'diskotopia',
@@ -130,6 +121,30 @@ class BandcampIE(InfoExtractor):
'album_artists': ['Diskotopia'],
},
}]
_WEBPAGE_TESTS = [{
# FIXME: Embed detection
'url': 'https://www.punknews.org/article/85809/stay-inside-super-sonic',
'info_dict': {
'id': '2475540375',
'ext': 'mp3',
'title': 'Stay Inside - Super Sonic',
'album': 'Lunger',
'album_artists': ['Stay Inside'],
'artists': ['Stay Inside'],
'duration': 166.157,
'release_date': '20251003',
'release_timestamp': 1759449600.0,
'thumbnail': r're:https?://f4\.bcbits\.com/img/.+\.jpg',
'timestamp': 1749473029.0,
'track': 'Super Sonic',
'track_id': '2475540375',
'track_number': 3,
'upload_date': '20250609',
'uploader': 'Stay Inside',
'uploader_id': 'stayinside',
'uploader_url': 'https://stayinside.bandcamp.com',
},
}]
def _extract_data_attr(self, webpage, video_id, attr='tralbum', fatal=True):
return self._parse_json(self._html_search_regex(
@@ -279,10 +294,10 @@ class BandcampAlbumIE(BandcampIE): # XXX: Do not subclass from concrete IE
'id': '1353101989',
'ext': 'mp3',
'title': 'Blazo - Intro',
'thumbnail': r're:https?://f4\.bcbits\.com/img/.+\.jpg',
'timestamp': 1311756226,
'upload_date': '20110727',
'uploader': 'Blazo',
'thumbnail': 'https://f4.bcbits.com/img/a1721150828_5.jpg',
'album_artists': ['Blazo'],
'uploader_url': 'https://blazo.bandcamp.com',
'release_date': '20110727',
@@ -302,6 +317,7 @@ class BandcampAlbumIE(BandcampIE): # XXX: Do not subclass from concrete IE
'id': '38097443',
'ext': 'mp3',
'title': 'Blazo - Kero One - Keep It Alive (Blazo remix)',
'thumbnail': r're:https?://f4\.bcbits\.com/img/.+\.jpg',
'timestamp': 1311757238,
'upload_date': '20110727',
'uploader': 'Blazo',
@@ -315,7 +331,6 @@ class BandcampAlbumIE(BandcampIE): # XXX: Do not subclass from concrete IE
'uploader_id': 'blazo',
'album_artists': ['Blazo'],
'artists': ['Blazo'],
'thumbnail': 'https://f4.bcbits.com/img/a1721150828_5.jpg',
'release_timestamp': 1311724800.0,
},
},

View File

@@ -1,79 +1,47 @@
from .mtv import MTVServicesInfoExtractor
from ..utils import unified_strdate
from .mtv import MTVServicesBaseIE
class BetIE(MTVServicesInfoExtractor):
_WORKING = False
_VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
_TESTS = [
{
'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
'info_dict': {
'id': '07e96bd3-8850-3051-b856-271b457f0ab8',
'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
'ext': 'flv',
'title': 'A Conversation With President Obama',
'description': 'President Obama urges persistence in confronting racism and bias.',
'duration': 1534,
'upload_date': '20141208',
'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
},
},
'params': {
# rtmp download
'skip_download': True,
},
class BetIE(MTVServicesBaseIE):
_VALID_URL = r'https?://(?:www\.)?bet\.com/(?:video-clips|episodes)/(?P<id>[\da-z]{6})'
_TESTS = [{
'url': 'https://www.bet.com/video-clips/w9mk7v',
'info_dict': {
'id': '3022d121-d191-43fd-b5fb-b2c26f335497',
'ext': 'mp4',
'display_id': 'w9mk7v',
'title': 'New Normal',
'description': 'md5:d7898c124713b4646cecad9d16ff01f3',
'duration': 30.08,
'series': 'Tyler Perry\'s Sistas',
'season': 'Season 0',
'season_number': 0,
'episode': 'Episode 0',
'episode_number': 0,
'timestamp': 1755269073,
'upload_date': '20250815',
},
{
'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
'info_dict': {
'id': '9f516bf1-7543-39c4-8076-dd441b459ba9',
'display_id': 'justice-for-ferguson-a-community-reacts',
'ext': 'flv',
'title': 'Justice for Ferguson: A Community Reacts',
'description': 'A BET News special.',
'duration': 1696,
'upload_date': '20141125',
'thumbnail': r're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
},
},
'params': {
# rtmp download
'skip_download': True,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.bet.com/episodes/nmce72/tyler-perry-s-sistas-heavy-is-the-crown-season-9-ep-5',
'info_dict': {
'id': '6427562b-3029-11f0-b405-16fff45bc035',
'ext': 'mp4',
'display_id': 'nmce72',
'title': 'Heavy Is the Crown',
'description': 'md5:1ed345d3157a50572d2464afcc7a652a',
'channel': 'BET',
'duration': 2550.0,
'thumbnail': r're:https://images\.paramount\.tech/uri/mgid:arc:imageassetref',
'series': 'Tyler Perry\'s Sistas',
'season': 'Season 9',
'season_number': 9,
'episode': 'Episode 5',
'episode_number': 5,
'timestamp': 1755165600,
'upload_date': '20250814',
'release_timestamp': 1755129600,
'release_date': '20250814',
},
]
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/bet-mrss-player'
def _get_feed_query(self, uri):
return {
'uuid': uri,
}
def _extract_mgid(self, webpage):
return self._search_regex(r'data-uri="([^"]+)', webpage, 'mgid')
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
mgid = self._extract_mgid(webpage)
videos_info = self._get_videos_info(mgid)
info_dict = videos_info['entries'][0]
upload_date = unified_strdate(self._html_search_meta('date', webpage))
description = self._html_search_meta('description', webpage)
info_dict.update({
'display_id': display_id,
'description': description,
'upload_date': upload_date,
})
return info_dict
'params': {'skip_download': 'm3u8'},
'skip': 'Requires provider sign-in',
}]

View File

@@ -175,13 +175,6 @@ class BilibiliBaseIE(InfoExtractor):
else:
note = f'Downloading video formats for cid {cid}'
# TODO: remove this patch once utils.networking.random_user_agent() is updated, see #13735
# playurl requests carrying old UA will be rejected
headers = {
'User-Agent': f'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/{random.randint(118,138)}.0.0.0 Safari/537.36',
**(headers or {}),
}
return self._download_json(
'https://api.bilibili.com/x/player/wbi/playurl', bvid,
query=self._sign_wbi(params, bvid), headers=headers, note=note)['data']
@@ -311,7 +304,7 @@ class BilibiliBaseIE(InfoExtractor):
class BiliBiliIE(BilibiliBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/(?:video/|festival/[^/?#]+\?(?:[^#]*&)?bvid=)[aAbB][vV](?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/(?:video/|festival/[^/?#]+\?(?:[^#]*&)?bvid=)(?P<prefix>[aAbB][vV])(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://www.bilibili.com/video/BV13x41117TL',
@@ -570,7 +563,7 @@ class BiliBiliIE(BilibiliBaseIE):
},
}],
}, {
'note': '301 redirect to bangumi link',
'note': 'redirect from bvid to bangumi link via redirect_url',
'url': 'https://www.bilibili.com/video/BV1TE411f7f1',
'info_dict': {
'id': '288525',
@@ -587,7 +580,27 @@ class BiliBiliIE(BilibiliBaseIE):
'duration': 1183.957,
'timestamp': 1571648124,
'upload_date': '20191021',
'thumbnail': r're:^https?://.*\.(jpg|jpeg|png)$',
'thumbnail': r're:https?://.*\.(jpg|jpeg|png)$',
},
}, {
'note': 'redirect from aid to bangumi link via redirect_url',
'url': 'https://www.bilibili.com/video/av114868162141203',
'info_dict': {
'id': '1933368',
'title': 'PV 引爆变革的起点',
'ext': 'mp4',
'duration': 63.139,
'series': '时光代理人',
'series_id': '5183',
'season': '第三季',
'season_number': 4,
'season_id': '105212',
'episode': '引爆变革的起点',
'episode_number': 1,
'episode_id': '1933368',
'timestamp': 1752849001,
'upload_date': '20250718',
'thumbnail': r're:https?://.*\.(jpg|jpeg|png)$',
},
}, {
'note': 'video has subtitles, which requires login',
@@ -643,7 +656,7 @@ class BiliBiliIE(BilibiliBaseIE):
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video_id, prefix = self._match_valid_url(url).group('id', 'prefix')
headers = self.geo_verification_headers()
webpage, urlh = self._download_webpage_handle(url, video_id, headers=headers)
if not self._match_valid_url(urlh.url):
@@ -651,7 +664,24 @@ class BiliBiliIE(BilibiliBaseIE):
headers['Referer'] = url
initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id)
initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id, default=None)
if not initial_state:
if self._search_json(r'\bwindow\._riskdata_\s*=', webpage, 'risk', video_id, default={}).get('v_voucher'):
raise ExtractorError('You have exceeded the rate limit. Try again later', expected=True)
query = {'platform': 'web'}
prefix = prefix.upper()
if prefix == 'BV':
query['bvid'] = prefix + video_id
elif prefix == 'AV':
query['aid'] = video_id
detail = self._download_json(
'https://api.bilibili.com/x/web-interface/wbi/view/detail', video_id,
note='Downloading redirection URL', errnote='Failed to download redirection URL',
query=self._sign_wbi(query, video_id), headers=headers)
new_url = traverse_obj(detail, ('data', 'View', 'redirect_url', {url_or_none}))
if new_url and BiliBiliBangumiIE.suitable(new_url):
return self.url_result(new_url, BiliBiliBangumiIE)
raise ExtractorError('Unable to extract initial state')
if traverse_obj(initial_state, ('error', 'trueCode')) == -403:
self.raise_login_required()

View File

@@ -19,8 +19,19 @@ class BloggerIE(InfoExtractor):
'id': 'BLOGGER-video-3c740e3a49197e16-796',
'title': 'BLOGGER-video-3c740e3a49197e16-796',
'ext': 'mp4',
'thumbnail': r're:^https?://.*',
'duration': 76.068,
'thumbnail': r're:https?://i9\.ytimg\.com/vi_blogger/.+',
},
}]
_WEBPAGE_TESTS = [{
'url': 'https://blog.tomeuvizoso.net/2019/01/a-panfrost-milestone.html',
'md5': 'f1bc19b6ea1b0fd1d81e84ca9ec467ac',
'info_dict': {
'id': 'BLOGGER-video-3c740e3a49197e16-12203',
'ext': 'mp4',
'title': 'BLOGGER-video-3c740e3a49197e16-12203',
'duration': 76.068,
'thumbnail': r're:https?://i9\.ytimg\.com/vi_blogger/.+',
},
}]

View File

@@ -19,18 +19,16 @@ class CloudflareStreamIE(InfoExtractor):
'id': '31c9291ab41fac05471db4e73aa11717',
'ext': 'mp4',
'title': '31c9291ab41fac05471db4e73aa11717',
'thumbnail': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/thumbnails/thumbnail.jpg',
},
'params': {
'skip_download': 'm3u8',
'thumbnail': r're:https?://cloudflarestream\.com/.+\.jpg',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://watch.cloudflarestream.com/embed/sdk-iframe-integration.fla9.latest.js?video=0e8e040aec776862e1d632a699edf59e',
'info_dict': {
'id': '0e8e040aec776862e1d632a699edf59e',
'ext': 'mp4',
'title': '0e8e040aec776862e1d632a699edf59e',
'thumbnail': 'https://cloudflarestream.com/0e8e040aec776862e1d632a699edf59e/thumbnails/thumbnail.jpg',
'thumbnail': r're:https?://cloudflarestream\.com/.+\.jpg',
},
}, {
'url': 'https://watch.cloudflarestream.com/9df17203414fd1db3e3ed74abbe936c1',
@@ -54,11 +52,21 @@ class CloudflareStreamIE(InfoExtractor):
'id': 'eaef9dea5159cf968be84241b5cedfe7',
'ext': 'mp4',
'title': 'eaef9dea5159cf968be84241b5cedfe7',
'thumbnail': 'https://cloudflarestream.com/eaef9dea5159cf968be84241b5cedfe7/thumbnails/thumbnail.jpg',
'thumbnail': r're:https?://cloudflarestream\.com/.+\.jpg',
},
'params': {
'extractor_args': {'generic': {'impersonate': ['chrome']}},
'skip_download': 'm3u8',
},
}, {
# FIXME: Embed detection
'url': 'https://www.cloudflare.com/developer-platform/products/cloudflare-stream/',
'info_dict': {
'id': 'e7bd2dd67e0f8860b4ae81e33a966049',
'ext': 'mp4',
'title': 'e7bd2dd67e0f8860b4ae81e33a966049',
'thumbnail': r're:https?://cloudflarestream\.com/.+\.jpg',
},
}]
def _real_extract(self, url):

View File

@@ -1,55 +0,0 @@
from .mtv import MTVIE
# TODO: Remove - Reason: Outdated Site
class CMTIE(MTVIE): # XXX: Do not subclass from concrete IE
_WORKING = False
IE_NAME = 'cmt.com'
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows|(?:full-)?episodes|video-clips)/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.cmt.com/videos/garth-brooks/989124/the-call-featuring-trisha-yearwood.jhtml#artist=30061',
'md5': 'e6b7ef3c4c45bbfae88061799bbba6c2',
'info_dict': {
'id': '989124',
'ext': 'mp4',
'title': 'Garth Brooks - "The Call (featuring Trisha Yearwood)"',
'description': 'Blame It All On My Roots',
},
'skip': 'Video not available',
}, {
'url': 'http://www.cmt.com/videos/misc/1504699/still-the-king-ep-109-in-3-minutes.jhtml#id=1739908',
'md5': 'e61a801ca4a183a466c08bd98dccbb1c',
'info_dict': {
'id': '1504699',
'ext': 'mp4',
'title': 'Still The King Ep. 109 in 3 Minutes',
'description': 'Relive or catch up with Still The King by watching this recap of season 1, episode 9.',
'timestamp': 1469421000.0,
'upload_date': '20160725',
},
}, {
'url': 'http://www.cmt.com/shows/party-down-south/party-down-south-ep-407-gone-girl/1738172/playlist/#id=1738172',
'only_matching': True,
}, {
'url': 'http://www.cmt.com/full-episodes/537qb3/nashville-the-wayfaring-stranger-season-5-ep-501',
'only_matching': True,
}, {
'url': 'http://www.cmt.com/video-clips/t9e4ci/nashville-juliette-in-2-minutes',
'only_matching': True,
}]
def _extract_mgid(self, webpage, url):
mgid = self._search_regex(
r'MTVN\.VIDEO\.contentUri\s*=\s*([\'"])(?P<mgid>.+?)\1',
webpage, 'mgid', group='mgid', default=None)
if not mgid:
mgid = self._extract_triforce_mgid(webpage)
return mgid
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mgid = self._extract_mgid(webpage, url)
return self.url_result(f'http://media.mtvnservices.com/embed/{mgid}')

View File

@@ -1,55 +1,27 @@
from .mtv import MTVServicesInfoExtractor
from .mtv import MTVServicesBaseIE
class ComedyCentralIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cc\.com/(?:episodes|video(?:-clips)?|collection-playlist|movies)/(?P<id>[0-9a-z]{6})'
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
class ComedyCentralIE(MTVServicesBaseIE):
_VALID_URL = r'https?://(?:www\.)?cc\.com/video-clips/(?P<id>[\da-z]{6})'
_TESTS = [{
'url': 'http://www.cc.com/video-clips/5ke9v2/the-daily-show-with-trevor-noah-doc-rivers-and-steve-ballmer---the-nba-player-strike',
'md5': 'b8acb347177c680ff18a292aa2166f80',
'url': 'https://www.cc.com/video-clips/wl12cx',
'info_dict': {
'id': '89ccc86e-1b02-4f83-b0c9-1d9592ecd025',
'id': 'dec6953e-80c8-43b3-96cd-05e9230e704d',
'ext': 'mp4',
'title': 'The Daily Show with Trevor Noah|August 28, 2020|25|25149|Doc Rivers and Steve Ballmer - The NBA Player Strike',
'description': 'md5:5334307c433892b85f4f5e5ac9ef7498',
'timestamp': 1598670000,
'upload_date': '20200829',
'display_id': 'wl12cx',
'title': 'Alison Brie and Dave Franco -"Together"- Extended Interview',
'description': 'md5:ec68e38d3282f863de9cde0ce5cd231c',
'duration': 516.76,
'thumbnail': r're:https://images\.paramount\.tech/uri/mgid:arc:imageassetref:',
'series': 'The Daily Show',
'season': 'Season 30',
'season_number': 30,
'episode': 'Episode 0',
'episode_number': 0,
'timestamp': 1753973314,
'upload_date': '20250731',
'release_timestamp': 1753977914,
'release_date': '20250731',
},
}, {
'url': 'http://www.cc.com/episodes/pnzzci/drawn-together--american-idol--parody-clip-show-season-3-ep-314',
'only_matching': True,
}, {
'url': 'https://www.cc.com/video/k3sdvm/the-daily-show-with-jon-stewart-exclusive-the-fourth-estate',
'only_matching': True,
}, {
'url': 'https://www.cc.com/collection-playlist/cosnej/stand-up-specials/t6vtjb',
'only_matching': True,
}, {
'url': 'https://www.cc.com/movies/tkp406/a-cluesterfuenke-christmas',
'only_matching': True,
'params': {'skip_download': 'm3u8'},
}]
class ComedyCentralTVIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/folgen/(?P<id>[0-9a-z]{6})'
_TESTS = [{
'url': 'https://www.comedycentral.tv/folgen/pxdpec/josh-investigates-klimawandel-staffel-1-ep-1',
'info_dict': {
'id': '15907dc3-ec3c-11e8-a442-0e40cf2fc285',
'ext': 'mp4',
'title': 'Josh Investigates',
'description': 'Steht uns das Ende der Welt bevor?',
},
}]
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'
_GEO_COUNTRIES = ['DE']
def _get_feed_query(self, uri):
return {
'accountOverride': 'intl.mtvi.com',
'arcEp': 'web.cc.tv',
'ep': 'b9032c3a',
'imageEp': 'web.cc.tv',
'mgid': uri,
}

View File

@@ -243,7 +243,7 @@ class InfoExtractor:
* extra_param_to_segment_url A query string to append to each
fragment's URL, or to update each existing query string
with. If it is an HLS stream with an AES-128 decryption key,
the query paramaters will be passed to the key URI as well,
the query parameters will be passed to the key URI as well,
unless there is an `extra_param_to_key_url` given,
or unless an external key URI is provided via `hls_aes`.
Only applied by the native HLS/DASH downloaders.
@@ -263,6 +263,7 @@ class InfoExtractor:
* a string in the format of CLIENT[:OS]
* a list or a tuple of CLIENT[:OS] strings or ImpersonateTarget instances
* a boolean value; True means any impersonate target is sufficient
* available_at Unix timestamp of when a format will be available to download
* downloader_options A dictionary of downloader options
(For internal use only)
* http_chunk_size Chunk size for HTTP downloads
@@ -419,7 +420,7 @@ class InfoExtractor:
__post_extractor: A function to be called just before the metadata is
written to either disk, logger or console. The function
must return a dict which will be added to the info_dict.
This is usefull for additional information that is
This is useful for additional information that is
time-consuming to extract. Note that the fields thus
extracted will not be available to output template and
match_filter. So, only "comments" and "comment_count" are
@@ -1527,11 +1528,11 @@ class InfoExtractor:
r'>\s*(?:18\s+U(?:\.S\.C\.|SC)\s+)?(?:§+\s*)?2257\b',
]
age_limit = 0
age_limit = None
for marker in AGE_LIMIT_MARKERS:
mobj = re.search(marker, html)
if mobj:
age_limit = max(age_limit, int(traverse_obj(mobj, 1, default=18)))
age_limit = max(age_limit or 0, int(traverse_obj(mobj, 1, default=18)))
return age_limit
def _media_rating_search(self, html):
@@ -2968,7 +2969,7 @@ class InfoExtractor:
else:
codecs = parse_codecs(codec_str)
if content_type not in ('video', 'audio', 'text'):
if mime_type == 'image/jpeg':
if mime_type in ('image/avif', 'image/jpeg'):
content_type = mime_type
elif codecs.get('vcodec', 'none') != 'none':
content_type = 'video'
@@ -3028,14 +3029,14 @@ class InfoExtractor:
'manifest_url': mpd_url,
'filesize': filesize,
}
elif content_type == 'image/jpeg':
elif content_type in ('image/avif', 'image/jpeg'):
# See test case in VikiIE
# https://www.viki.com/videos/1175236v-choosing-spouse-by-lottery-episode-1
f = {
'format_id': format_id,
'ext': 'mhtml',
'manifest_url': mpd_url,
'format_note': 'DASH storyboards (jpeg)',
'format_note': f'DASH storyboards ({mimetype2ext(mime_type)})',
'acodec': 'none',
'vcodec': 'none',
}
@@ -3107,7 +3108,6 @@ class InfoExtractor:
else:
# $Number*$ or $Time$ in media template with S list available
# Example $Number*$: http://www.svtplay.se/klipp/9023742/stopptid-om-bjorn-borg
# Example $Time$: https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411
representation_ms_info['fragments'] = []
segment_time = 0
segment_d = None
@@ -3177,7 +3177,7 @@ class InfoExtractor:
'url': mpd_url or base_url,
'fragment_base_url': base_url,
'fragments': [],
'protocol': 'http_dash_segments' if mime_type != 'image/jpeg' else 'mhtml',
'protocol': 'mhtml' if mime_type in ('image/avif', 'image/jpeg') else 'http_dash_segments',
})
if 'initialization_url' in representation_ms_info:
initialization_url = representation_ms_info['initialization_url']
@@ -3192,7 +3192,7 @@ class InfoExtractor:
else:
# Assuming direct URL to unfragmented media.
f['url'] = base_url
if content_type in ('video', 'audio', 'image/jpeg'):
if content_type in ('video', 'audio', 'image/avif', 'image/jpeg'):
f['manifest_stream_number'] = stream_numbers[f['url']]
stream_numbers[f['url']] += 1
period_entry['formats'].append(f)

View File

@@ -96,6 +96,24 @@ class CondeNastIE(InfoExtractor):
'upload_date': '20150916',
'timestamp': 1442434920,
},
}, {
# FIXME: Subtitles
'url': 'https://www.vanityfair.com/video/watch/vf-quiz-show-squid-game-s3',
'info_dict': {
'id': '6862f999c1afbc5ff06b4803',
'ext': 'mp4',
'title': '\'Squid Game\' Cast Tests How Well They Know Each Other',
'categories': ['Arts & Culture', 'Hollywood'],
'description': 'md5:7a9c668a1fc87648e77da13842ec1534',
'duration': 955,
'season': 'Season 1',
'series': 'Quizzing Each Other',
'tags': 'count:2',
'thumbnail': r're:https?://dwgyu36up6iuz\.cloudfront\.net/.+\.jpg',
'timestamp': 1751341306,
'upload_date': '20250701',
'uploader': 'vanityfair',
},
}, {
'url': 'https://player.cnevids.com/inline/video/59138decb57ac36b83000005.js?target=js-cne-player',
'only_matching': True,

View File

@@ -8,7 +8,6 @@ from ..utils import (
class CrooksAndLiarsIE(InfoExtractor):
_VALID_URL = r'https?://embed\.crooksandliars\.com/(?:embed|v)/(?P<id>[A-Za-z0-9]+)'
_EMBED_REGEX = [r'<(?:iframe[^>]+src|param[^>]+value)=(["\'])(?P<url>(?:https?:)?//embed\.crooksandliars\.com/(?:embed|v)/.+?)\1']
_TESTS = [{
'url': 'https://embed.crooksandliars.com/embed/8RUoRhRi',
'info_dict': {
@@ -16,7 +15,7 @@ class CrooksAndLiarsIE(InfoExtractor):
'ext': 'mp4',
'title': 'Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!',
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
'thumbnail': r're:^https?://.*\.jpg',
'thumbnail': r're:https?://crooksandliars\.com/files/.+',
'timestamp': 1428207000,
'upload_date': '20150405',
'uploader': 'Heather',
@@ -26,6 +25,20 @@ class CrooksAndLiarsIE(InfoExtractor):
'url': 'http://embed.crooksandliars.com/v/MTE3MjUtMzQ2MzA',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://crooksandliars.com/2015/04/fox-friends-says-protecting-atheists',
'info_dict': {
'id': '8RUoRhRi',
'ext': 'mp4',
'title': 'Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!',
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
'duration': 236,
'thumbnail': r're:https?://crooksandliars\.com/files/.+',
'timestamp': 1428207000,
'upload_date': '20150405',
'uploader': 'Heather',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)

View File

@@ -19,11 +19,22 @@ class DailyMailIE(InfoExtractor):
'ext': 'mp4',
'title': 'The Mountain appears in sparkling water ad for \'Heavy Bubbles\'',
'description': 'md5:a93d74b6da172dd5dc4d973e0b766a84',
'thumbnail': r're:https?://i\.dailymail\.co\.uk/.+\.jpg',
},
}, {
'url': 'http://www.dailymail.co.uk/embed/video/1295863.html',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.daily-news.gr/lifestyle/%ce%b7-%cf%84%cf%81%ce%b1%ce%b3%ce%bf%cf%85%ce%b4%ce%af%cf%83%cf%84%cf%81%ce%b9%ce%b1-jessie-j-%ce%bc%ce%bf%ce%b9%cf%81%ce%ac%cf%83%cf%84%ce%b7%ce%ba%ce%b5-%cf%83%cf%85%ce%b3%ce%ba%ce%bb%ce%bf%ce%bd/',
'info_dict': {
'id': '3463585',
'ext': 'mp4',
'title': 'Jessie J reveals she has undergone surgery as she shares clips',
'description': 'md5:9fa9a25feca5b656b0b4a39c922fad1e',
'thumbnail': r're:https?://i\.dailymail\.co\.uk/.+\.jpg',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)

View File

@@ -119,13 +119,14 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
_EMBED_REGEX = [rf'(?ix)<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)["\'](?P<url>{_VALID_URL[5:]})']
_TESTS = [{
'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news',
'md5': '074b95bdee76b9e3654137aee9c79dfe',
'info_dict': {
'id': 'x5kesuj',
'ext': 'mp4',
'title': 'Office Christmas Party Review Jason Bateman, Olivia Munn, T.J. Miller',
'description': 'Office Christmas Party Review - Jason Bateman, Olivia Munn, T.J. Miller',
'duration': 187,
'tags': 'count:5',
'thumbnail': r're:https?://s[12]\.dmcdn\.net/v/.+',
'timestamp': 1493651285,
'upload_date': '20170501',
'uploader': 'Deadline',
@@ -133,18 +134,17 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'age_limit': 0,
'view_count': int,
'like_count': int,
'tags': ['hollywood', 'celeb', 'celebrity', 'movies', 'red carpet'],
'thumbnail': r're:https://(?:s[12]\.)dmcdn\.net/v/K456B1cmt4ZcZ9KiM/x1080',
},
}, {
'url': 'https://geo.dailymotion.com/player.html?video=x89eyek&mute=true',
'md5': 'e2f9717c6604773f963f069ca53a07f8',
'info_dict': {
'id': 'x89eyek',
'ext': 'mp4',
'title': "En quête d'esprit du 27/03/2022",
'title': 'En quête d\'esprit du 27/03/2022',
'description': 'md5:66542b9f4df2eb23f314fc097488e553',
'duration': 2756,
'tags': 'count:1',
'thumbnail': r're:https?://s[12]\.dmcdn\.net/v/.+',
'timestamp': 1648383669,
'upload_date': '20220327',
'uploader': 'CNEWS',
@@ -152,8 +152,6 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'age_limit': 0,
'view_count': int,
'like_count': int,
'tags': ['en_quete_d_esprit'],
'thumbnail': r're:https://(?:s[12]\.)dmcdn\.net/v/Tncwi1clTH6StrxMP/x1080',
},
}, {
'url': 'https://www.dailymotion.com/video/x2iuewm_steam-machine-models-pricing-listed-on-steam-store-ign-news_videogames',
@@ -163,8 +161,8 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'ext': 'mp4',
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News',
'description': 'Several come bundled with the Steam Controller.',
'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
'duration': 74,
'thumbnail': r're:https?://s[12]\.dmcdn\.net/v/.+',
'timestamp': 1425657362,
'upload_date': '20150306',
'uploader': 'IGN',
@@ -173,20 +171,6 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'view_count': int,
},
'skip': 'video gone',
}, {
# Vevo video
'url': 'http://www.dailymotion.com/video/x149uew_katy-perry-roar-official_musi',
'info_dict': {
'title': 'Roar (Official)',
'id': 'USUV71301934',
'ext': 'mp4',
'uploader': 'Katy Perry',
'upload_date': '20130905',
},
'params': {
'skip_download': True,
},
'skip': 'VEVO is only available in some countries',
}, {
# age-restricted video
'url': 'http://www.dailymotion.com/video/xyh2zz_leanna-decker-cyber-girl-of-the-year-desires-nude-playboy-plus_redband',
@@ -259,9 +243,9 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'uploader_id': 'x2vtgmm',
'age_limit': 0,
'tags': [],
'thumbnail': r're:https?://s[12]\.dmcdn\.net/v/.+',
'view_count': int,
'like_count': int,
'thumbnail': r're:https://\w+.dmcdn.net/v/WnEY61cmvMxt2Fi6d/x1080',
},
}, {
# https://geo.dailymotion.com/player/xf7zn.html?playlist=x7wdsj
@@ -276,18 +260,18 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'info_dict': {
'id': 'x8u4owg',
'ext': 'mp4',
'description': 'À bord du « véloto », lalternative à la voiture pour la campagne',
'like_count': int,
'uploader': 'Le Parisien',
'thumbnail': 'https://www.leparisien.fr/resizer/ho_GwveeYftNkLwg_cEta--5Bv4=/1200x675/cloudfront-eu-central-1.images.arcpublishing.com/leparisien/BFXJNEBN75EUNHGYJLORUC3TX4.jpg',
'upload_date': '20240309',
'view_count': int,
'tags': 'count:7',
'thumbnail': r're:https?://www\.leparisien\.fr/.+\.jpg',
'timestamp': 1709997866,
'age_limit': 0,
'uploader_id': 'x32f7b',
'title': 'VIDÉO. Le «\xa0véloto\xa0», la voiture à pédales qui aimerait se faire une place sur les routes',
'duration': 428.0,
'description': 'À bord du « véloto », lalternative à la voiture pour la campagne',
'tags': ['biclou', 'vélo', 'véloto', 'campagne', 'voiture', 'environnement', 'véhicules intermédiaires'],
},
}, {
# https://geo.dailymotion.com/player/xry80.html?video=x8vu47w
@@ -297,9 +281,9 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'ext': 'mp4',
'like_count': int,
'uploader': 'Metatube',
'thumbnail': r're:https://\w+.dmcdn.net/v/W1G_S1coGSFTfkTeR/x1080',
'upload_date': '20240326',
'view_count': int,
'thumbnail': r're:https?://s[12]\.dmcdn\.net/v/.+',
'timestamp': 1711496732,
'age_limit': 0,
'uploader_id': 'x2xpy74',
@@ -308,6 +292,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'description': 'Que lindura',
'tags': [],
},
'skip': 'Invalid URL',
}, {
# //geo.dailymotion.com/player/xysxq.html?video=k2Y4Mjp7krAF9iCuINM
'url': 'https://lcp.fr/programmes/avant-la-catastrophe-la-naissance-de-la-dictature-nazie-1933-1936-346819',
@@ -322,11 +307,30 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'like_count': int,
'age_limit': 0,
'duration': 3220,
'thumbnail': 'https://s1.dmcdn.net/v/Xvumk1djJBUZfjj2a/x1080',
'tags': [],
'thumbnail': r're:https?://s[12]\.dmcdn\.net/v/.+',
'timestamp': 1739919947,
'upload_date': '20250218',
},
'skip': 'Invalid URL',
}, {
'url': 'https://forum.ionicframework.com/t/ionic-2-jw-player-dailymotion-player/83248',
'info_dict': {
'id': 'xwr14q',
'ext': 'mp4',
'title': 'Macklemore & Ryan Lewis - Thrift Shop (feat. Wanz)',
'age_limit': 0,
'description': 'md5:47fbe168b5a6ddc4a205e20dd6c841b2',
'duration': 234,
'like_count': int,
'tags': 'count:5',
'thumbnail': r're:https?://s[12]\.dmcdn\.net/v/.+',
'timestamp': 1358177670,
'upload_date': '20130114',
'uploader': 'Macklemore Official',
'uploader_id': 'x19qlwr',
'view_count': int,
},
}]
_GEO_BYPASS = False
_COMMON_MEDIA_FIELDS = '''description
@@ -540,7 +544,7 @@ class DailymotionSearchIE(DailymotionPlaylistBaseIE):
'id': 'king of turtles',
'title': 'king of turtles',
},
'playlist_mincount': 90,
'playlist_mincount': 0,
}]
_SEARCH_QUERY = 'query SEARCH_QUERY( $query: String! $page: Int $limit: Int ) { search { videos( query: $query first: $limit page: $page ) { edges { node { xid } } } } } '
@@ -584,7 +588,7 @@ class DailymotionUserIE(DailymotionPlaylistBaseIE):
'info_dict': {
'id': 'nqtv',
},
'playlist_mincount': 152,
'playlist_mincount': 148,
}, {
'url': 'http://www.dailymotion.com/user/UnderProject',
'info_dict': {

View File

@@ -12,13 +12,13 @@ class DBTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'Skulle teste ut fornøyelsespark, men kollegaen var bare opptatt av bikinikroppen',
'description': 'md5:49cc8370e7d66e8a2ef15c3b4631fd3f',
'thumbnail': r're:https?://.*\.jpg',
'thumbnail': r're:https?://.+\.jpg',
'upload_date': '20160916',
'duration': 69,
'uploader_id': 'UCk5pvsyZJoYJBd7_oFPTlRQ',
'uploader': 'Dagbladet',
},
'add_ie': ['Youtube'],
'skip': 'Invalid URL',
}, {
'url': 'https://www.dagbladet.no/video/embed/xlGmyIeN9Jo/?autoplay=false',
'only_matching': True,
@@ -26,6 +26,20 @@ class DBTVIE(InfoExtractor):
'url': 'https://www.dagbladet.no/video/truer-iran-bor-passe-dere/PalfB2Cw',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# FIXME: Embed detection
'url': 'https://www.dagbladet.no/nyheter/rekordstort-russisk-angrep/83325693',
'info_dict': {
'id': '1HW7fYry',
'ext': 'mp4',
'title': 'Putin taler - så skjer dette',
'description': 'md5:3e8bacee33de861a9663d9a3fcc54e5e',
'display_id': 'putin-taler-sa-skjer-dette',
'thumbnail': r're:https?://cdn\.jwplayer\.com/v2/media/.+',
'timestamp': 1751043600,
'upload_date': '20250627',
},
}]
def _real_extract(self, url):
display_id, video_id = self._match_valid_url(url).groups()

View File

@@ -4,6 +4,7 @@ from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
determine_ext,
jwt_decode_hs256,
parse_codecs,
try_get,
@@ -222,11 +223,18 @@ class DigitalConcertHallIE(InfoExtractor):
raise
formats = []
for m3u8_url in traverse_obj(stream_info, ('channel', ..., 'stream', ..., 'url', {url_or_none})):
formats.extend(self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
for fmt in formats:
if fmt.get('format_note') and fmt.get('vcodec') == 'none':
fmt.update(parse_codecs(fmt['format_note']))
for fmt_url in traverse_obj(stream_info, ('channel', ..., 'stream', ..., 'url', {url_or_none})):
ext = determine_ext(fmt_url)
if ext == 'm3u8':
fmts = self._extract_m3u8_formats(fmt_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
for fmt in fmts:
if fmt.get('format_note') and fmt.get('vcodec') == 'none':
fmt.update(parse_codecs(fmt['format_note']))
formats.extend(fmts)
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(fmt_url, video_id, mpd_id='dash', fatal=False))
else:
self.report_warning(f'Skipping unsupported format extension "{ext}"')
yield {
'id': video_id,

View File

@@ -90,10 +90,6 @@ class DisneyIE(InfoExtractor):
webpage, 'embed data'), video_id)
video_data = page_data['video']
for external in video_data.get('externals', []):
if external.get('source') == 'vevo':
return self.url_result('vevo:' + external['data_id'], 'Vevo')
video_id = video_data['id']
title = video_data['title']

View File

@@ -1,215 +0,0 @@
import functools
import re
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
int_or_none,
smuggle_url,
unsmuggle_url,
url_or_none,
)
class EaglePlatformIE(InfoExtractor):
_VALID_URL = r'''(?x)
(?:
eagleplatform:(?P<custom_host>[^/]+):|
https?://(?P<host>.+?\.media\.eagleplatform\.com)/index/player\?.*\brecord_id=
)
(?P<id>\d+)
'''
_EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//.+?\.media\.eagleplatform\.com/index/player\?.+?)\1']
_TESTS = [{
# http://lenta.ru/news/2015/03/06/navalny/
'url': 'http://lentaru.media.eagleplatform.com/index/player?player=new&record_id=227304&player_template_id=5201',
# Not checking MD5 as sometimes the direct HTTP link results in 404 and HLS is used
'info_dict': {
'id': '227304',
'ext': 'mp4',
'title': 'Навальный вышел на свободу',
'description': 'md5:d97861ac9ae77377f3f20eaf9d04b4f5',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 87,
'view_count': int,
'age_limit': 0,
},
}, {
# http://muz-tv.ru/play/7129/
# http://media.clipyou.ru/index/player?record_id=12820&width=730&height=415&autoplay=true
'url': 'eagleplatform:media.clipyou.ru:12820',
'md5': '358597369cf8ba56675c1df15e7af624',
'info_dict': {
'id': '12820',
'ext': 'mp4',
'title': "'O Sole Mio",
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 216,
'view_count': int,
},
'skip': 'Georestricted',
}, {
# referrer protected video (https://tvrain.ru/lite/teleshow/kak_vse_nachinalos/namin-418921/)
'url': 'eagleplatform:tvrainru.media.eagleplatform.com:582306',
'only_matching': True,
}]
@classmethod
def _extract_embed_urls(cls, url, webpage):
add_referer = functools.partial(smuggle_url, data={'referrer': url})
res = tuple(super()._extract_embed_urls(url, webpage))
if res:
return map(add_referer, res)
PLAYER_JS_RE = r'''
<script[^>]+
src=(?P<qjs>["\'])(?:https?:)?//(?P<host>(?:(?!(?P=qjs)).)+\.media\.eagleplatform\.com)/player/player\.js(?P=qjs)
.+?
'''
# "Basic usage" embedding (see http://dultonmedia.github.io/eplayer/)
mobj = re.search(
rf'''(?xs)
{PLAYER_JS_RE}
<div[^>]+
class=(?P<qclass>["\'])eagleplayer(?P=qclass)[^>]+
data-id=["\'](?P<id>\d+)
''', webpage)
if mobj is not None:
return [add_referer('eagleplatform:{host}:{id}'.format(**mobj.groupdict()))]
# Generalization of "Javascript code usage", "Combined usage" and
# "Usage without attaching to DOM" embeddings (see
# http://dultonmedia.github.io/eplayer/)
mobj = re.search(
r'''(?xs)
%s
<script>
.+?
new\s+EaglePlayer\(
(?:[^,]+\s*,\s*)?
{
.+?
\bid\s*:\s*["\']?(?P<id>\d+)
.+?
}
\s*\)
.+?
</script>
''' % PLAYER_JS_RE, webpage) # noqa: UP031
if mobj is not None:
return [add_referer('eagleplatform:{host}:{id}'.format(**mobj.groupdict()))]
@staticmethod
def _handle_error(response):
status = int_or_none(response.get('status', 200))
if status != 200:
raise ExtractorError(' '.join(response['errors']), expected=True)
def _download_json(self, url_or_request, video_id, *args, **kwargs):
try:
response = super()._download_json(
url_or_request, video_id, *args, **kwargs)
except ExtractorError as ee:
if isinstance(ee.cause, HTTPError):
response = self._parse_json(ee.cause.response.read().decode('utf-8'), video_id)
self._handle_error(response)
raise
return response
def _get_video_url(self, url_or_request, video_id, note='Downloading JSON metadata'):
return self._download_json(url_or_request, video_id, note)['data'][0]
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
mobj = self._match_valid_url(url)
host, video_id = mobj.group('custom_host') or mobj.group('host'), mobj.group('id')
headers = {}
query = {
'id': video_id,
}
referrer = smuggled_data.get('referrer')
if referrer:
headers['Referer'] = referrer
query['referrer'] = referrer
player_data = self._download_json(
f'http://{host}/api/player_data', video_id,
headers=headers, query=query)
media = player_data['data']['playlist']['viewports'][0]['medialist'][0]
title = media['title']
description = media.get('description')
thumbnail = self._proto_relative_url(media.get('snapshot'), 'http:')
duration = int_or_none(media.get('duration'))
view_count = int_or_none(media.get('views'))
age_restriction = media.get('age_restriction')
age_limit = None
if age_restriction:
age_limit = 0 if age_restriction == 'allow_all' else 18
secure_m3u8 = self._proto_relative_url(media['sources']['secure_m3u8']['auto'], 'http:')
formats = []
m3u8_url = self._get_video_url(secure_m3u8, video_id, 'Downloading m3u8 JSON')
m3u8_formats = self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False)
formats.extend(m3u8_formats)
m3u8_formats_dict = {}
for f in m3u8_formats:
if f.get('height') is not None:
m3u8_formats_dict[f['height']] = f
mp4_data = self._download_json(
# Secure mp4 URL is constructed according to Player.prototype.mp4 from
# http://lentaru.media.eagleplatform.com/player/player.js
re.sub(r'm3u8|hlsvod|hls|f4m', 'mp4s', secure_m3u8),
video_id, 'Downloading mp4 JSON', fatal=False)
if mp4_data:
for format_id, format_url in mp4_data.get('data', {}).items():
if not url_or_none(format_url):
continue
height = int_or_none(format_id)
if height is not None and m3u8_formats_dict.get(height):
f = m3u8_formats_dict[height].copy()
f.update({
'format_id': f['format_id'].replace('hls', 'http'),
'protocol': 'http',
})
else:
f = {
'format_id': f'http-{format_id}',
'height': int_or_none(format_id),
}
f['url'] = format_url
formats.append(f)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'view_count': view_count,
'age_limit': age_limit,
'formats': formats,
}
class ClipYouEmbedIE(InfoExtractor):
_VALID_URL = False
@classmethod
def _extract_embed_urls(cls, url, webpage):
mobj = re.search(
r'<iframe[^>]+src="https?://(?P<host>media\.clipyou\.ru)/index/player\?.*\brecord_id=(?P<id>\d+).*"', webpage)
if mobj is not None:
yield smuggle_url('eagleplatform:{host}:{id}'.format(**mobj.groupdict()), {'referrer': url})

View File

@@ -64,14 +64,12 @@ class ERTFlixCodenameIE(ERTFlixBaseIE):
_VALID_URL = r'ertflix:(?P<id>[\w-]+)'
_TESTS = [{
'url': 'ertflix:monogramma-praxitelis-tzanoylinos',
'md5': '5b9c2cd171f09126167e4082fc1dd0ef',
'info_dict': {
'id': 'monogramma-praxitelis-tzanoylinos',
'ext': 'mp4',
'title': 'md5:ef0b439902963d56c43ac83c3f41dd0e',
'title': 'monogramma-praxitelis-tzanoylinos',
},
},
]
}]
def _extract_formats_and_subs(self, video_id):
media_info = self._call_api(video_id, codename=video_id)
@@ -131,13 +129,14 @@ class ERTFlixIE(ERTFlixBaseIE):
'duration': 3166,
'age_limit': 8,
},
'skip': 'Invalid URL',
}, {
'url': 'https://www.ertflix.gr/series/ser.3448-monogramma',
'info_dict': {
'id': 'ser.3448',
'age_limit': 8,
'description': 'Η εκπομπή σαράντα ετών που σημάδεψε τον πολιτισμό μας.',
'title': 'Μονόγραμμα',
'title': 'Monogramma',
'description': 'md5:e30cc640e6463da87f210a8ed10b2439',
},
'playlist_mincount': 64,
}, {
@@ -145,28 +144,28 @@ class ERTFlixIE(ERTFlixBaseIE):
'info_dict': {
'id': 'ser.3448',
'age_limit': 8,
'description': 'Η εκπομπή σαράντα ετών που σημάδεψε τον πολιτισμό μας.',
'title': 'Μονόγραμμα',
'title': 'Monogramma',
'description': 'md5:e30cc640e6463da87f210a8ed10b2439',
},
'playlist_count': 22,
'playlist_mincount': 66,
}, {
'url': 'https://www.ertflix.gr/series/ser.3448-monogramma?season=1&season=2021%20-%202022',
'info_dict': {
'id': 'ser.3448',
'age_limit': 8,
'description': 'Η εκπομπή σαράντα ετών που σημάδεψε τον πολιτισμό μας.',
'title': 'Μονόγραμμα',
'title': 'Monogramma',
'description': 'md5:e30cc640e6463da87f210a8ed10b2439',
},
'playlist_mincount': 36,
'playlist_mincount': 25,
}, {
'url': 'https://www.ertflix.gr/series/ser.164991-to-diktuo-1?season=1-9',
'info_dict': {
'id': 'ser.164991',
'age_limit': 8,
'description': 'Η πρώτη ελληνική εκπομπή με θεματολογία αποκλειστικά γύρω από το ίντερνετ.',
'title': 'Το δίκτυο',
'title': 'The Network',
'description': 'The first Greek show featuring topics exclusively around the internet.',
},
'playlist_mincount': 9,
'playlist_mincount': 0,
}, {
'url': 'https://www.ertflix.gr/en/vod/vod.127652-ta-kalytera-mas-chronia-ep1-mia-volta-sto-feggari',
'only_matching': True,
@@ -282,6 +281,16 @@ class ERTWebtvEmbedIE(InfoExtractor):
'ext': 'mp4',
'thumbnail': 'https://program.ert.gr/photos/2022/1/to_diktio_ep09_i_istoria_tou_diadiktiou_stin_Ellada_1021x576.jpg',
},
'skip': 'Invalid URL',
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.ertnews.gr/video/manolis-goyalles-o-anthropos-piso-apo-ti-diadiktyaki-vasilopita/',
'info_dict': {
'id': '2022/tv/news-themata-ianouarios/20220114-apotis6-gouales-pita.mp4',
'ext': 'mp4',
'title': 'VOD - 2022/tv/news-themata-ianouarios/20220114-apotis6-gouales-pita.mp4',
'thumbnail': r're:https?://www\.ert\.gr/themata/photos/.+\.jpg',
},
}]
def _real_extract(self, url):

View File

@@ -81,13 +81,14 @@ class FacebookIE(InfoExtractor):
'description': 'md5:34675bda53336b1d16400265c2bb9b3b',
'uploader': 'RADIO KICKS FM',
'upload_date': '20230818',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1692346159,
'thumbnail': r're:^https?://.*',
'uploader_id': '100063551323670',
'duration': 3133.583,
'view_count': int,
'concurrent_view_count': 0,
},
'expected_warnings': ['Cannot parse data'],
}, {
'url': 'https://www.facebook.com/video.php?v=637842556329505&fref=nf',
'md5': '6a40d33c0eccbb1af76cf0485a052659',
@@ -106,17 +107,18 @@ class FacebookIE(InfoExtractor):
'info_dict': {
'id': '274175099429670',
'ext': 'mp4',
'title': 'Asif',
'title': '119 reactions · 1.4K shares | Asif Nawab Butt on Reels',
'description': '',
'uploader': 'Asif Nawab Butt',
'upload_date': '20140506',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1399398998,
'thumbnail': r're:^https?://.*',
'uploader_id': 'pfbid05AzrFTXgY37tqwaSgbFTTEpCLBjjEJHkigogwGiRPtKEpAsJYJpzE94H1RxYXWEtl',
'uploader_id': 'pfbid028xue38TBXRyNbiqBSV2LFs3QK3yopvKjupbqFoL6U9SKbx4p2SMdJjQSBvnjsHGWl',
'duration': 131.03,
'concurrent_view_count': int,
'view_count': int,
},
'expected_warnings': ['Cannot parse data'],
}, {
'note': 'Video with DASH manifest',
'url': 'https://www.facebook.com/video.php?v=957955867617029',
@@ -158,7 +160,7 @@ class FacebookIE(InfoExtractor):
'id': '10153664894881749',
'ext': 'mp4',
'title': 'Average time to confirm recent Supreme Court nominees: 67 days Longest it\'s t...',
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1456259628,
'upload_date': '20160223',
'uploader': 'Barack Obama',
@@ -168,7 +170,7 @@ class FacebookIE(InfoExtractor):
# have 1080P, but only up to 720p in swf params
# data.video.story.attachments[].media
'url': 'https://www.facebook.com/cnn/videos/10155529876156509/',
'md5': '1659aa21fb3dd1585874f668e81a72c8',
'md5': '70b82ebf5f0e9b91b2a49d3db3563611',
'info_dict': {
'id': '10155529876156509',
'ext': 'mp4',
@@ -177,7 +179,7 @@ class FacebookIE(InfoExtractor):
'timestamp': 1477818095,
'upload_date': '20161030',
'uploader': 'CNN',
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'view_count': int,
'uploader_id': '100059479812265',
'concurrent_view_count': int,
@@ -198,13 +200,11 @@ class FacebookIE(InfoExtractor):
'uploader': 'Yaroslav Korpan',
'uploader_id': 'pfbid06AScABAWcW91qpiuGrLt99Ef9tvwHoXP6t8KeFYEqkSfreMtfa9nTveh8b2ZEVSWl',
'concurrent_view_count': int,
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'view_count': int,
'duration': 11736.446,
},
'params': {
'skip_download': True,
},
'skip': 'Invalid URL',
}, {
# FIXME: Cannot parse data error
'url': 'https://www.facebook.com/LaGuiaDelVaron/posts/1072691702860471',
@@ -215,7 +215,7 @@ class FacebookIE(InfoExtractor):
'timestamp': 1477305000,
'upload_date': '20161024',
'uploader': 'La Guía Del Varón',
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
},
'skip': 'Requires logging in',
}, {
@@ -244,9 +244,10 @@ class FacebookIE(InfoExtractor):
'upload_date': '20171124',
'uploader': 'Vickie Gentry',
'uploader_id': 'pfbid0FkkycT95ySNNyfCw4Cho6u5G7WbbZEcxT496Hq8rtx1K3LcTCATpR3wnyYhmyGC5l',
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'duration': 148.224,
},
'skip': 'Invalid URL',
}, {
# data.node.comet_sections.content.story.attachments[].styles.attachment.media
'url': 'https://www.facebook.com/attn/posts/pfbid0j1Czf2gGDVqeQ8KiMLFm3pWN8GxsQmeRrVhimWDzMuKQoR8r4b1knNsejELmUgyhl',
@@ -260,7 +261,7 @@ class FacebookIE(InfoExtractor):
'duration': 132.675,
'uploader_id': '100064451419378',
'view_count': int,
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1701975646,
},
}, {
@@ -271,9 +272,9 @@ class FacebookIE(InfoExtractor):
'ext': 'mp4',
'title': 'Lela Evans',
'description': 'Today Makkovik\'s own Pilot Mandy Smith made her inaugural landing on the airstrip in her hometown. What a proud moment as we all cheered and...',
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'uploader': 'Lela Evans',
'uploader_id': 'pfbid0swT2y7t6TAsZVBvcyeYPdhTMefGaS26mzUwML3vd1ma6ndGZKxsyS4Ssu3jitZLXl',
'uploader_id': 'pfbid02wjMpknobSMnyynK3TNKN4Ww1StcpAKXgowqTyge3bz7LwHZMQ68uiXzzbu7xeryBl',
'upload_date': '20231228',
'timestamp': 1703804085,
'duration': 394.347,
@@ -326,28 +327,27 @@ class FacebookIE(InfoExtractor):
'uploader_id': '100066514874195',
'duration': 4524.001,
'view_count': int,
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'concurrent_view_count': int,
},
'params': {
'skip_download': True,
},
'params': {'skip_download': True},
}, {
# data.node.comet_sections.content.story.attachments[].style_type_renderer.attachment.all_subattachments.nodes[].media
'url': 'https://www.facebook.com/100033620354545/videos/106560053808006/',
'info_dict': {
'id': '106560053808006',
'ext': 'mp4',
'title': 'Josef',
'thumbnail': r're:^https?://.*',
'title': 'Josef Novak on Reels',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'concurrent_view_count': int,
'uploader_id': 'pfbid02gpfwRM2XvdEJfsERupwQiNmBiDArc38RMRYZnap372q6Vs7MtFTVy72mmFWpJBTKl',
'uploader_id': 'pfbid0cjYJYXpePWqhZ9DgpB6gKXrN2q3obwducdKm4wT7K5nkhbfKg5cneocYbsdaji7fl',
'timestamp': 1549275572,
'duration': 3.283,
'uploader': 'Josef Novak',
'description': '',
'upload_date': '20190204',
},
'expected_warnings': ['Cannot parse data'],
}, {
# data.video.story.attachments[].media
'url': 'https://www.facebook.com/watch/?v=647537299265662',
@@ -406,7 +406,7 @@ class FacebookIE(InfoExtractor):
'ext': 'mp4',
'title': 'ANALISI IN CAMPO OSCURO " Coaguli nel sangue dei vaccinati"',
'description': 'Other event by Comitato Liberi Pensatori on Tuesday, October 18 2022',
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'uploader': 'Comitato Liberi Pensatori',
'uploader_id': '100065709540881',
},
@@ -414,6 +414,56 @@ class FacebookIE(InfoExtractor):
'url': 'https://www.facebook.com/groups/1513990329015294/posts/d41d8cd9/2013209885760000/?app=fbl',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# <iframe> embed
'url': 'http://www.unique-almeria.com/mini-hollywood.html',
'md5': 'cba5d8c5021e9340dcefe925255e2c3e',
'info_dict': {
'id': '1529066599879',
'ext': 'mp4',
'title': 'Facebook video #1529066599879',
},
'expected_warnings': ['unable to extract uploader'],
}, {
# FIXME: Embed detection
# <iframe> embed, plugin video
'url': 'https://www.newsmemory.com/eedition/e-publishing-solutions/2-in-one-app/',
'md5': 'ae97d4a44f8cc9a8b1a4c03b9ed793af',
'info_dict': {
'id': '10155710648695814',
'ext': 'mp4',
'title': 'Download the all new and improved Trinidad Express App',
'concurrent_view_count': int,
'description': 'md5:4806195c99908e4189b45b1c23bd4f89',
'duration': 69.408,
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1533919195,
'upload_date': '20180810',
'uploader': 'Trinidad Express Newspapers',
'uploader_id': '100064446413648',
'view_count': int,
},
'expected_warnings': ['Cannot parse data'],
}, {
# API embed
'url': 'https://www.curs.md/ro',
'md5': '090bae53b9bff2be993c896edc2ea205',
'info_dict': {
'id': '334484292523563',
'ext': 'mp4',
'title': 'md5:9abffe1c86cdd967ffa224e1ccc13b90',
'concurrent_view_count': int,
'description': 'md5:0ba98567a61c640f9fabf1882235b33d',
'duration': 8789.891,
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1700603114,
'upload_date': '20231121',
'uploader': 'Istoria Moldovei',
'uploader_id': '100063529778592',
'view_count': int,
},
'params': {'extractor_args': {'generic': {'impersonate': ['chrome']}}},
}]
_SUPPORTED_PAGLETS_REGEX = r'(?:pagelet_group_mall|permalink_video_pagelet|hyperfeed_story_id_[0-9a-f]+)'
_api_config = {
'graphURI': '/api/graphql/',
@@ -898,20 +948,24 @@ class FacebookIE(InfoExtractor):
class FacebookPluginsVideoIE(InfoExtractor):
_VALID_URL = r'https?://(?:[\w-]+\.)?facebook\.com/plugins/video\.php\?.*?\bhref=(?P<id>https.+)'
_TESTS = [{
'url': 'https://www.facebook.com/plugins/video.php?href=https%3A%2F%2Fwww.facebook.com%2Fgov.sg%2Fvideos%2F10154383743583686%2F&show_text=0&width=560',
'md5': '5954e92cdfe51fe5782ae9bda7058a07',
'md5': 'af83aeae1d595f377c6e47a450828155',
'info_dict': {
'id': '10154383743583686',
'ext': 'mp4',
# TODO: Fix title, uploader
'title': 'What to do during the haze?',
'uploader': 'Gov.sg',
'upload_date': '20160826',
'concurrent_view_count': int,
'description': 'md5:81839c0979803a014b20798df255ed0b',
'duration': 65.087,
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1472184808,
'upload_date': '20160826',
'uploader': 'gov.sg',
'uploader_id': '100064718678925',
'view_count': int,
},
'add_ie': [FacebookIE.ie_key()],
'expected_warnings': ['Cannot parse data'],
}, {
'url': 'https://www.facebook.com/plugins/video.php?href=https%3A%2F%2Fwww.facebook.com%2Fvideo.php%3Fv%3D10204634152394104',
'only_matching': True,
@@ -945,7 +999,7 @@ class FacebookRedirectURLIE(InfoExtractor):
'tags': 'count:11',
'duration': 3332,
'live_status': 'not_live',
'thumbnail': 'https://i.ytimg.com/vi/pO8h3EaFRdo/maxresdefault.jpg',
'thumbnail': r're:https?://i\.ytimg\.com/vi/.+',
'channel_url': 'https://www.youtube.com/channel/UCGBpxWJr9FNOcFYA5GkKrMg',
'availability': 'public',
'uploader_url': 'http://www.youtube.com/user/brtvofficial',
@@ -954,8 +1008,7 @@ class FacebookRedirectURLIE(InfoExtractor):
'view_count': int,
'like_count': int,
},
'add_ie': ['Youtube'],
'params': {'skip_download': 'Youtube'},
'skip': 'Youtube video is now private',
}]
def _real_extract(self, url):
@@ -968,22 +1021,20 @@ class FacebookRedirectURLIE(InfoExtractor):
class FacebookReelIE(InfoExtractor):
_VALID_URL = r'https?://(?:[\w-]+\.)?facebook\.com/reel/(?P<id>\d+)'
IE_NAME = 'facebook:reel'
_TESTS = [{
'url': 'https://www.facebook.com/reel/1195289147628387',
'md5': 'a53256d10fc2105441fe0c4212ed8cea',
'md5': 'aeb0153ecb2eaacdf2dc2bf88f593fef',
'info_dict': {
'id': '1195289147628387',
'ext': 'mp4',
'title': r're:9\.6K views · 355 reactions .+ Let the “Slapathon” commence!! .+ LL COOL J · Mama Said Knock You Out$',
'description': r're:When your trying to help your partner .+ LL COOL J · Mama Said Knock You Out$',
'uploader': 'Beast Camp Training',
'title': '9.7K views · 352 reactions | When your trying to help your partner out with an arrest and #FAAFO games begin. Let the “Slapathon” commence!! 👊👋 | Beast Camp Training',
'description': 'md5:5a767dc7e78718667b150a7facc4a34f',
'uploader': '9.7K views &#xb7; 352 reactions | When your trying to help your partner out with an arrest and #FAAFO games begin. Let the &#x201c;Slapathon&#x201d; commence!! &#x1f44a;&#x1f44b; | Beast Camp Training',
'uploader_id': '100040874179269',
'duration': 9.579,
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1637502609,
'upload_date': '20211121',
'thumbnail': r're:^https?://.*',
'like_count': int,
'comment_count': int,
'repost_count': int,
},
@@ -998,7 +1049,6 @@ class FacebookReelIE(InfoExtractor):
class FacebookAdsIE(InfoExtractor):
_VALID_URL = r'https?://(?:[\w-]+\.)?facebook\.com/ads/library/?\?(?:[^#]+&)?id=(?P<id>\d+)'
IE_NAME = 'facebook:ads'
_TESTS = [{
'url': 'https://www.facebook.com/ads/library/?id=899206155126718',
'info_dict': {
@@ -1008,12 +1058,13 @@ class FacebookAdsIE(InfoExtractor):
'description': 'md5:0822724069e3aca97cbed5dabbab282e',
'uploader': 'Kandao',
'uploader_id': '774114102743284',
'uploader_url': r're:^https?://.*',
'uploader_url': 'https://facebook.com/KandaoVR',
'timestamp': 1702548330,
'thumbnail': r're:^https?://.*',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'upload_date': '20231214',
'like_count': int,
},
'skip': 'Invalid URL',
}, {
# key 'watermarked_video_sd_url' missing
'url': 'https://www.facebook.com/ads/library/?id=501152689226254',
@@ -1024,9 +1075,9 @@ class FacebookAdsIE(InfoExtractor):
'description': 'md5:02a446ace7ff8c3c37a2892922492490',
'uploader': 'mat.nawrocki',
'uploader_id': '148586968341456',
'uploader_url': r're:^https?://.*',
'uploader_url': 'https://www.instagram.com/_u/mat.nawrocki',
'thumbnail': r're:https?://scontent\.fitm\d-1\.fna\.fbcdn\.net/.+',
'timestamp': 1723452305,
'thumbnail': r're:^https?://.*',
'upload_date': '20240812',
'like_count': int,
},
@@ -1037,12 +1088,13 @@ class FacebookAdsIE(InfoExtractor):
'title': 'Jusqu\u2019\u00e0 -25% sur une s\u00e9lection de vins p\u00e9tillants italiens ',
'uploader': 'Eataly Paris Marais',
'uploader_id': '2086668958314152',
'uploader_url': r're:^https?://.*',
'uploader_url': 'https://facebook.com/EatalyParisMarais',
'timestamp': 1703571529,
'upload_date': '20231226',
'like_count': int,
},
'playlist_count': 3,
'skip': 'Invalid URL',
}, {
'url': 'https://es-la.facebook.com/ads/library/?id=901230958115569',
'only_matching': True,

237
yt_dlp/extractor/faulio.py Normal file
View File

@@ -0,0 +1,237 @@
import re
import urllib.parse
from .common import InfoExtractor
from ..utils import int_or_none, js_to_json, url_or_none
from ..utils.traversal import traverse_obj
class FaulioBaseIE(InfoExtractor):
_DOMAINS = (
'aloula.sba.sa',
'bahry.com',
'maraya.sba.net.ae',
'sat7plus.org',
)
_LANGUAGES = ('ar', 'en', 'fa')
_BASE_URL_RE = fr'https?://(?:{"|".join(map(re.escape, _DOMAINS))})/(?:(?:{"|".join(_LANGUAGES)})/)?'
def _get_headers(self, url):
parsed_url = urllib.parse.urlparse(url)
return {
'Referer': url,
'Origin': f'{parsed_url.scheme}://{parsed_url.hostname}',
}
def _get_api_base(self, url, video_id):
webpage = self._download_webpage(url, video_id)
config_data = self._search_json(
r'window\.__NUXT__\.config=', webpage, 'config', video_id, transform_source=js_to_json)
return config_data['public']['TRANSLATIONS_API_URL']
class FaulioIE(FaulioBaseIE):
_VALID_URL = fr'{FaulioBaseIE._BASE_URL_RE}(?:episode|media)/(?P<id>[a-zA-Z0-9-]+)'
_TESTS = [{
'url': 'https://aloula.sba.sa/en/episode/29102',
'info_dict': {
'id': 'aloula.faulio.com_29102',
'ext': 'mp4',
'display_id': 'هذا-مكانك-03-004-v-29102',
'title': 'الحلقة 4',
'episode': 'الحلقة 4',
'description': '',
'series': 'هذا مكانك',
'season': 'Season 3',
'season_number': 3,
'episode_number': 4,
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 4855,
'age_limit': 3,
},
}, {
'url': 'https://bahry.com/en/media/1191',
'info_dict': {
'id': 'bahry.faulio.com_1191',
'ext': 'mp4',
'display_id': 'Episode-4-1191',
'title': 'Episode 4',
'episode': 'Episode 4',
'description': '',
'series': 'Wild Water',
'season': 'Season 1',
'season_number': 1,
'episode_number': 4,
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1653,
'age_limit': 0,
},
}, {
'url': 'https://maraya.sba.net.ae/episode/127735',
'info_dict': {
'id': 'maraya.faulio.com_127735',
'ext': 'mp4',
'display_id': 'عبدالله-الهاجري---عبدالرحمن-المطروشي-127735',
'title': 'عبدالله الهاجري - عبدالرحمن المطروشي',
'episode': 'عبدالله الهاجري - عبدالرحمن المطروشي',
'description': 'md5:53de01face66d3d6303221e5a49388a0',
'series': 'أبناؤنا في الخارج',
'season': 'Season 3',
'season_number': 3,
'episode_number': 7,
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1316,
'age_limit': 0,
},
}, {
'url': 'https://sat7plus.org/episode/18165',
'info_dict': {
'id': 'sat7.faulio.com_18165',
'ext': 'mp4',
'display_id': 'ep-13-ADHD-18165',
'title': 'ADHD and creativity',
'episode': 'ADHD and creativity',
'description': '',
'series': 'ADHD Podcast',
'season': 'Season 1',
'season_number': 1,
'episode_number': 13,
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 2492,
'age_limit': 0,
},
}, {
'url': 'https://aloula.sba.sa/en/episode/0',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
api_base = self._get_api_base(url, video_id)
video_info = self._download_json(f'{api_base}/video/{video_id}', video_id, fatal=False)
player_info = self._download_json(f'{api_base}/video/{video_id}/player', video_id)
headers = self._get_headers(url)
formats = []
subtitles = {}
if hls_url := traverse_obj(player_info, ('settings', 'protocols', 'hls', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False, headers=headers)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
if mpd_url := traverse_obj(player_info, ('settings', 'protocols', 'dash', {url_or_none})):
fmts, subs = self._extract_mpd_formats_and_subtitles(
mpd_url, video_id, mpd_id='dash', fatal=False, headers=headers)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': f'{urllib.parse.urlparse(api_base).hostname}_{video_id}',
**traverse_obj(traverse_obj(video_info, ('blocks', 0)), {
'display_id': ('slug', {str}),
'title': ('title', {str}),
'episode': ('title', {str}),
'description': ('description', {str}),
'series': ('program_title', {str}),
'season_number': ('season_number', {int_or_none}),
'episode_number': ('episode', {int_or_none}),
'thumbnail': ('image', {url_or_none}),
'duration': ('duration', 'total', {int_or_none}),
'age_limit': ('age_rating', {int_or_none}),
}),
'formats': formats,
'subtitles': subtitles,
'http_headers': headers,
}
class FaulioLiveIE(FaulioBaseIE):
_VALID_URL = fr'{FaulioBaseIE._BASE_URL_RE}live/(?P<id>[a-zA-Z0-9-]+)'
_TESTS = [{
'url': 'https://aloula.sba.sa/live/saudiatv',
'info_dict': {
'id': 'aloula.faulio.com_saudiatv',
'title': str,
'description': str,
'ext': 'mp4',
'live_status': 'is_live',
},
'params': {
'skip_download': 'Livestream',
},
}, {
'url': 'https://bahry.com/live/1',
'info_dict': {
'id': 'bahry.faulio.com_1',
'title': str,
'description': str,
'ext': 'mp4',
'live_status': 'is_live',
},
'params': {
'skip_download': 'Livestream',
},
}, {
'url': 'https://maraya.sba.net.ae/live/1',
'info_dict': {
'id': 'maraya.faulio.com_1',
'title': str,
'description': str,
'ext': 'mp4',
'live_status': 'is_live',
},
'params': {
'skip_download': 'Livestream',
},
}, {
'url': 'https://sat7plus.org/live/pars',
'info_dict': {
'id': 'sat7.faulio.com_pars',
'title': str,
'description': str,
'ext': 'mp4',
'live_status': 'is_live',
},
'params': {
'skip_download': 'Livestream',
},
}, {
'url': 'https://sat7plus.org/fa/live/arabic',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
api_base = self._get_api_base(url, video_id)
channel = traverse_obj(
self._download_json(f'{api_base}/channels', video_id),
(lambda k, v: v['url'] == video_id, any))
headers = self._get_headers(url)
formats = []
subtitles = {}
if hls_url := traverse_obj(channel, ('streams', 'hls', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
hls_url, video_id, 'mp4', m3u8_id='hls', live=True, fatal=False, headers=headers)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
if mpd_url := traverse_obj(channel, ('streams', 'mpd', {url_or_none})):
fmts, subs = self._extract_mpd_formats_and_subtitles(
mpd_url, video_id, mpd_id='dash', fatal=False, headers=headers)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': f'{urllib.parse.urlparse(api_base).hostname}_{video_id}',
**traverse_obj(channel, {
'title': ('title', {str}),
'description': ('description', {str}),
}),
'formats': formats,
'subtitles': subtitles,
'http_headers': headers,
'is_live': True,
}

View File

@@ -22,8 +22,23 @@ class FC2IE(InfoExtractor):
'md5': 'a6ebe8ebe0396518689d963774a54eb7',
'info_dict': {
'id': '20121103kUan1KHs',
'ext': 'flv',
'title': 'Boxing again with Puff',
'ext': 'mp4',
'thumbnail': r're:https?://.+\.jpe?g',
},
'params': {
'skip_download': 'm3u8',
},
}, {
# Direct video url
'url': 'https://video.fc2.com/content/20121209FP73fxDx',
'md5': '066bdb9b3a56a97f49cbf0d0b8a75a1f',
'info_dict': {
'id': '20121209FP73fxDx',
'title': 'Farewelling The Wiggles Live in Sydney Dec 8 2012',
'ext': 'mp4',
'thumbnail': r're:https?://.+\.jpe?g',
'description': 'Saying goodbye to the Wiggles at their Celebration Concert in Sydney, and what a concert that was!',
},
}, {
'url': 'http://video.fc2.com/en/content/20150125cEva0hDn/',
@@ -104,7 +119,7 @@ class FC2IE(InfoExtractor):
'title': title,
'url': vid_url,
'ext': 'mp4',
'protocol': 'm3u8_native',
'protocol': 'm3u8_native' if vidplaylist.get('type') == 2 else 'https',
'description': description,
'thumbnail': thumbnail,
}

View File

@@ -1,9 +1,7 @@
import urllib.parse
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
float_or_none,
url_or_none,
)
@@ -58,16 +56,7 @@ class FrancaisFacileIE(InfoExtractor):
def _real_extract(self, url):
display_id = urllib.parse.unquote(self._match_id(url))
try: # yt-dlp's default user-agents are too old and blocked by the site
webpage = self._download_webpage(url, display_id, headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
})
except ExtractorError as e:
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
raise
# Retry with impersonation if hardcoded UA is insufficient
webpage = self._download_webpage(url, display_id, impersonate=True)
webpage = self._download_webpage(url, display_id)
data = self._search_json(
r'<script[^>]+\bdata-media-id=[^>]+\btype="application/json"[^>]*>',

View File

@@ -363,13 +363,7 @@ class FranceTVSiteIE(FranceTVBaseInfoExtractor):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
nextjs_data = self._search_nextjs_v13_data(webpage, display_id)
if get_first(nextjs_data, ('isLive', {bool})):
# For livestreams we need the id of the stream instead of the currently airing episode id
video_id = get_first(nextjs_data, ('options', 'id', {str}))
else:
video_id = get_first(nextjs_data, ('video', ('playerReplayId', 'siId'), {str}))
video_id = get_first(nextjs_data, ('options', 'id', {str}))
if not video_id:
raise ExtractorError('Unable to extract video ID')

File diff suppressed because it is too large Load Diff

View File

@@ -112,16 +112,17 @@ class GlomexIE(GlomexBaseIE):
_TESTS = [{
'url': 'https://video.glomex.com/sport/v-cb24uwg77hgh-nach-2-0-sieg-guardiola-mit-mancity-vor-naechstem-titel',
'md5': 'cec33a943c4240c9cb33abea8c26242e',
'info_dict': {
'id': 'v-cb24uwg77hgh',
'ext': 'mp4',
'title': 'md5:38a90cedcfadd72982c81acf13556e0c',
'title': 'Nach 2:0-Sieg: Guardiola mit ManCity vor nächstem Titel',
'description': 'md5:1ea6b6caff1443fcbbba159e432eedb8',
'duration': 29600,
'thumbnail': r're:https?://i[a-z0-9]thumbs\.glomex\.com/.+',
'timestamp': 1619895017,
'upload_date': '20210501',
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
@@ -140,16 +141,17 @@ class GlomexEmbedIE(GlomexBaseIE):
_TESTS = [{
'url': 'https://player.glomex.com/integration/1/iframe-player.html?integrationId=4059a013k56vb2yd&playlistId=v-cfa6lye0dkdd-sf',
'md5': '68f259b98cc01918ac34180142fce287',
'info_dict': {
'id': 'v-cfa6lye0dkdd-sf',
'ext': 'mp4',
'title': 'Φώφη Γεννηματά: Ο επικήδειος λόγος του 17χρονου γιου της, Γιώργου',
'thumbnail': r're:https?://i[a-z0-9]thumbs\.glomex\.com/.+',
'timestamp': 1635337199,
'duration': 133080,
'upload_date': '20211027',
'description': 'md5:e741185fc309310ff5d0c789b437be66',
'title': 'md5:35647293513a6c92363817a0fb0a7961',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://player.glomex.com/integration/1/iframe-player.html?origin=fullpage&integrationId=19syy24xjn1oqlpc&playlistId=rl-vcb49w1fb592p&playlistIndex=0',
'info_dict': {
@@ -157,12 +159,27 @@ class GlomexEmbedIE(GlomexBaseIE):
},
'playlist_count': 100,
}, {
# Geo-restricted
'url': 'https://player.glomex.com/integration/1/iframe-player.html?playlistId=cl-bgqaata6aw8x&integrationId=19syy24xjn1oqlpc',
'info_dict': {
'id': 'cl-bgqaata6aw8x',
},
'playlist_mincount': 2,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.skai.gr/news/world/iatrikos-syllogos-tourkias-to-turkovac-aplo-dialyma-erntogan-eiste-apateones-kai-pseytes',
'info_dict': {
'id': 'v-ch2nkhcirwc9-sf',
'ext': 'mp4',
'title': 'Ιατρικός Σύλλογος Τουρκίας: Το Turkovac είναι ένα απλό διάλυμα –Ερντογάν: Είστε απατεώνες και ψεύτες',
'description': 'md5:8b517a61d577efe7e36fde72fd535995',
'duration': 460000,
'thumbnail': r're:https?://i[a-z0-9]thumbs\.glomex\.com/.+',
'timestamp': 1641885019,
'upload_date': '20220111',
},
'params': {'skip_download': 'm3u8'},
}]
@classmethod
def build_player_url(cls, video_id, integration, origin_url=None):

View File

@@ -11,7 +11,6 @@ from ..utils import (
class IndavideoEmbedIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:embed\.)?indavideo\.hu/player/video/|assets\.indavideo\.hu/swf/player\.swf\?.*\b(?:v(?:ID|id))=)(?P<id>[\da-f]+)'
# Some example URLs covered by generic extractor:
# https://indavideo.hu/video/Vicces_cica_1
# https://index.indavideo.hu/video/Hod_Nemetorszagban
# https://auto.indavideo.hu/video/Sajat_utanfutoban_a_kis_tacsko
# https://film.indavideo.hu/video/f_farkaslesen
@@ -25,14 +24,14 @@ class IndavideoEmbedIE(InfoExtractor):
'ext': 'mp4',
'title': 'Cicatánc',
'description': '',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'cukiajanlo',
'uploader_id': '83729',
'thumbnail': r're:https?://pics\.indavideo\.hu/videos/.+\.jpg',
'timestamp': 1439193826,
'upload_date': '20150810',
'duration': 72,
'age_limit': 0,
'tags': ['tánc', 'cica', 'cuki', 'cukiajanlo', 'newsroom'],
'tags': 'count:5',
},
}, {
'url': 'https://embed.indavideo.hu/player/video/1bdc3c6d80?autostart=1&hide=1',
@@ -45,14 +44,30 @@ class IndavideoEmbedIE(InfoExtractor):
'ext': 'mp4',
'title': 'Vicces cica',
'description': 'Játszik a tablettel. :D',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:https?://pics\.indavideo\.hu/videos/.+\.jpg',
'uploader': 'Jet_Pack',
'uploader_id': '491217',
'timestamp': 1390821212,
'upload_date': '20140127',
'duration': 7,
'age_limit': 0,
'tags': ['cica', 'Jet_Pack'],
'tags': 'count:2',
},
}, {
'url': 'https://palyazat.indavideo.hu/video/RUSH_1',
'info_dict': {
'id': '3808180',
'ext': 'mp4',
'title': 'RUSH',
'age_limit': 0,
'description': '',
'duration': 650,
'tags': 'count:2',
'thumbnail': r're:https?://pics\.indavideo\.hu/videos/.+\.jpg',
'timestamp': 1729136266,
'upload_date': '20241017',
'uploader': '7summerfilms',
'uploader_id': '1628496',
},
}]

View File

@@ -22,18 +22,17 @@ class JojIE(InfoExtractor):
'id': 'a388ec4c-6019-4a4a-9312-b1bee194e932',
'ext': 'mp4',
'title': 'NOVÉ BÝVANIE',
'thumbnail': r're:^https?://.*?$',
'duration': 3118,
'thumbnail': r're:https?://img\.joj\.sk/.+',
},
}, {
'url': 'https://media.joj.sk/embed/CSM0Na0l0p1',
'info_dict': {
'id': 'CSM0Na0l0p1',
'ext': 'mp4',
'height': 576,
'title': 'Extrémne rodiny 2 - POKRAČOVANIE (2012/04/09 21:30:00)',
'duration': 3937,
'thumbnail': r're:^https?://.*?$',
'thumbnail': r're:https?://img\.joj\.sk/.+',
},
}, {
'url': 'https://media.joj.sk/embed/9i1cxv',
@@ -45,6 +44,15 @@ class JojIE(InfoExtractor):
'url': 'joj:9i1cxv',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# FIXME: Embed detection
'url': 'https://www.noviny.sk/slovensko/238543-slovenskom-sa-prehnala-vlna-silnych-burok',
'info_dict': {
'id': '238543-slovenskom-sa-prehnala-vlna-silnych-burok',
'title': 'Slovenskom sa prehnala vlna silných búrok',
},
'playlist_mincount': 5,
}]
def _real_extract(self, url):
video_id = self._match_id(url)

View File

@@ -8,7 +8,6 @@ class JWPlatformIE(InfoExtractor):
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview|manifest)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
_TESTS = [{
'url': 'http://content.jwplatform.com/players/nPripu9l-ALJ3XQCI.js',
'md5': '3aa16e4f6860e6e78b7df5829519aed3',
'info_dict': {
'id': 'nPripu9l',
'ext': 'mp4',
@@ -17,13 +16,12 @@ class JWPlatformIE(InfoExtractor):
'upload_date': '20081127',
'timestamp': 1227796140,
'duration': 32.0,
'thumbnail': 'https://cdn.jwplayer.com/v2/media/nPripu9l/poster.jpg?width=720',
'thumbnail': r're:https?://cdn\.jwplayer\.com/v2/media/.+',
},
}, {
'url': 'https://cdn.jwplayer.com/players/nPripu9l-ALJ3XQCI.js',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# JWPlatform iframe
'url': 'https://www.covermagazine.co.uk/feature/2465255/business-protection-involved',
@@ -33,10 +31,11 @@ class JWPlatformIE(InfoExtractor):
'upload_date': '20160719',
'timestamp': 1468923808,
'title': '2016_05_18 Cover L&G Business Protection V1 FINAL.mp4',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/AG26UQXM/poster.jpg?width=720',
'thumbnail': r're:https?://cdn\.jwplayer\.com/v2/media/.+',
'description': '',
'duration': 294.0,
},
'skip': 'Site no longer embeds JWPlatform',
}, {
# Player url not surrounded by quotes
'url': 'https://www.deutsche-kinemathek.de/en/online/streaming/school-trip',
@@ -45,12 +44,12 @@ class JWPlatformIE(InfoExtractor):
'title': 'Klassenfahrt',
'ext': 'mp4',
'upload_date': '20230109',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/jUxh5uin/poster.jpg?width=720',
'thumbnail': r're:https?://cdn\.jwplayer\.com/v2/media/.+',
'timestamp': 1673270298,
'description': '',
'duration': 5193.0,
},
'params': {'allowed_extractors': ['generic', 'jwplatform']},
'skip': 'Site no longer embeds JWPlatform',
}, {
# iframe src attribute includes backslash before URL string
'url': 'https://www.elespectador.com/colombia/video-asi-se-evito-la-fuga-de-john-poulos-presunto-feminicida-de-valentina-trespalacios-explicacion',
@@ -59,11 +58,24 @@ class JWPlatformIE(InfoExtractor):
'title': 'Así se evitó la fuga de John Poulos, presunto feminicida de Valentina Trespalacios',
'ext': 'mp4',
'upload_date': '20230127',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/QD3gsexj/poster.jpg?width=720',
'thumbnail': r're:https?://cdn\.jwplayer\.com/v2/media/.+',
'timestamp': 1674862986,
'description': 'md5:128fd74591c4e1fc2da598c5cb6f5ce4',
'duration': 263.0,
},
}, {
'url': 'https://www.skimag.com/video/ski-people-1980',
'info_dict': {
'id': 'YTmgRiNU',
'ext': 'mp4',
'title': 'Ski People (1980)',
'channel': 'snow',
'description': 'md5:cf9c3d101452c91e141f292b19fe4843',
'duration': 5688.0,
'thumbnail': r're:https?://cdn\.jwplayer\.com/v2/media/.+',
'timestamp': 1610407738,
'upload_date': '20210111',
},
}]
@classmethod

View File

@@ -41,149 +41,188 @@ class KalturaIE(InfoExtractor):
2: 'ttml',
3: 'vtt',
}
_TESTS = [
{
'url': 'kaltura:269692:1_1jc2y3e4',
'md5': '3adcbdb3dcc02d647539e53f284ba171',
'info_dict': {
'id': '1_1jc2y3e4',
'ext': 'mp4',
'title': 'Straight from the Heart',
'upload_date': '20131219',
'uploader_id': 'mlundberg@wolfgangsvault.com',
'description': 'The Allman Brothers Band, 12/16/1981',
'thumbnail': 're:^https?://.*/thumbnail/.*',
'timestamp': int,
_TESTS = [{
'url': 'kaltura:269692:1_1jc2y3e4',
'md5': '3adcbdb3dcc02d647539e53f284ba171',
'info_dict': {
'id': '1_1jc2y3e4',
'ext': 'mp4',
'title': 'Straight from the Heart',
'upload_date': '20131219',
'uploader_id': 'mlundberg@wolfgangsvault.com',
'description': 'The Allman Brothers Band, 12/16/1981',
'thumbnail': r're:https?://.+/thumbnail/.+',
'timestamp': int,
},
'skip': 'The access to this service is forbidden since the specified partner is blocked',
}, {
'url': 'http://www.kaltura.com/index.php/kwidget/cache_st/1300318621/wid/_269692/uiconf_id/3873291/entry_id/1_1jc2y3e4',
'only_matching': True,
}, {
'url': 'https://cdnapisec.kaltura.com/index.php/kwidget/wid/_557781/uiconf_id/22845202/entry_id/1_plr1syf3',
'only_matching': True,
}, {
'url': 'https://cdnapisec.kaltura.com/html5/html5lib/v2.30.2/mwEmbedFrame.php/p/1337/uiconf_id/20540612/entry_id/1_sf5ovm7u?wid=_243342',
'only_matching': True,
}, {
# video with subtitles
'url': 'kaltura:111032:1_cw786r8q',
'only_matching': True,
}, {
# video with ttml subtitles (no fileExt)
'url': 'kaltura:1926081:0_l5ye1133',
'info_dict': {
'id': '0_l5ye1133',
'ext': 'mp4',
'title': 'What Can You Do With Python?',
'upload_date': '20160221',
'uploader_id': 'stork',
'thumbnail': r're:https?://.+/thumbnail/.+',
'timestamp': int,
'subtitles': {
'en': [{
'ext': 'ttml',
}],
},
'skip': 'The access to this service is forbidden since the specified partner is blocked',
},
{
'url': 'http://www.kaltura.com/index.php/kwidget/cache_st/1300318621/wid/_269692/uiconf_id/3873291/entry_id/1_1jc2y3e4',
'only_matching': True,
'skip': 'Gone. Maybe https://www.safaribooksonline.com/library/tutorials/introduction-to-python-anon/3469/',
'params': {'skip_download': True},
}, {
'url': 'https://www.kaltura.com/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
'only_matching': True,
}, {
'url': 'https://www.kaltura.com:443/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
'only_matching': True,
}, {
# unavailable source format
'url': 'kaltura:513551:1_66x4rg7o',
'only_matching': True,
}, {
# html5lib URL using kwidget player
'url': 'https://cdnapisec.kaltura.com/html5/html5lib/v2.46/mwEmbedFrame.php/p/691292/uiconf_id/20499062/entry_id/0_c076mna6?wid=_691292&iframeembed=true&playerId=kaltura_player_1420508608&entry_id=0_c076mna6&flashvars%5BakamaiHD.loadingPolicy%5D=preInitialize&flashvars%5BakamaiHD.asyncInit%5D=true&flashvars%5BstreamerType%5D=hdnetwork',
'info_dict': {
'id': '0_c076mna6',
'ext': 'mp4',
'title': 'md5:4883e7acbcbf42583a2dddc97dee4855',
'duration': 3608,
'uploader_id': 'commons@swinburne.edu.au',
'timestamp': 1408086874,
'view_count': int,
'upload_date': '20140815',
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
},
{
'url': 'https://cdnapisec.kaltura.com/index.php/kwidget/wid/_557781/uiconf_id/22845202/entry_id/1_plr1syf3',
'only_matching': True,
}, {
# html5lib playlist URL using kwidget player
'url': 'https://cdnapisec.kaltura.com/html5/html5lib/v2.89/mwEmbedFrame.php/p/2019031/uiconf_id/40436601?wid=1_4j3m32cv&iframeembed=true&playerId=kaltura_player_&flashvars[playlistAPI.kpl0Id]=1_jovey5nu&flashvars[ks]=&&flashvars[imageDefaultDuration]=30&flashvars[localizationCode]=en&flashvars[leadWithHTML5]=true&flashvars[forceMobileHTML5]=true&flashvars[nextPrevBtn.plugin]=true&flashvars[hotspots.plugin]=true&flashvars[sideBarContainer.plugin]=true&flashvars[sideBarContainer.position]=left&flashvars[sideBarContainer.clickToClose]=true&flashvars[chapters.plugin]=true&flashvars[chapters.layout]=vertical&flashvars[chapters.thumbnailRotator]=false&flashvars[streamSelector.plugin]=true&flashvars[EmbedPlayer.SpinnerTarget]=videoHolder&flashvars[dualScreen.plugin]=true&flashvars[playlistAPI.playlistUrl]=https://canvasgatechtest.kaf.kaltura.com/playlist/details/{playlistAPI.kpl0Id}/categoryid/126428551',
'info_dict': {
'id': '1_jovey5nu',
'title': '00-00 Introduction',
},
{
'url': 'https://cdnapisec.kaltura.com/html5/html5lib/v2.30.2/mwEmbedFrame.php/p/1337/uiconf_id/20540612/entry_id/1_sf5ovm7u?wid=_243342',
'only_matching': True,
},
{
# video with subtitles
'url': 'kaltura:111032:1_cw786r8q',
'only_matching': True,
},
{
# video with ttml subtitles (no fileExt)
'url': 'kaltura:1926081:0_l5ye1133',
'info_dict': {
'id': '0_l5ye1133',
'ext': 'mp4',
'title': 'What Can You Do With Python?',
'upload_date': '20160221',
'uploader_id': 'stork',
'thumbnail': 're:^https?://.*/thumbnail/.*',
'timestamp': int,
'subtitles': {
'en': [{
'ext': 'ttml',
}],
'playlist': [
{
'info_dict': {
'id': '1_b1y5hlvx',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Introduction',
'duration': 91,
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
'view_count': int,
'timestamp': 1533154447,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
}, {
'info_dict': {
'id': '1_jfb7mdpn',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Three parts to the course',
'duration': 63,
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
'view_count': int,
'timestamp': 1533154489,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
}, {
'info_dict': {
'id': '1_8xflxdp7',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Textbooks',
'duration': 37,
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
'view_count': int,
'timestamp': 1533154512,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
}, {
'info_dict': {
'id': '1_3hqew8kn',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Prerequisites',
'duration': 49,
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
'view_count': int,
'timestamp': 1533154536,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
},
'skip': 'Gone. Maybe https://www.safaribooksonline.com/library/tutorials/introduction-to-python-anon/3469/',
'params': {
'skip_download': True,
},
],
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.cornell.edu/VIDEO/nima-arkani-hamed-standard-models-of-particle-physics',
'info_dict': {
'id': '1_sgtvehim',
'ext': 'mp4',
'title': 'Our "Standard Models" of particle physics and cosmology',
'duration': 5420,
'thumbnail': r're:https?://cdnsecakmi\.kaltura\.com/.+',
'timestamp': 1321158993,
'upload_date': '20111113',
'uploader_id': 'kps1',
'view_count': int,
},
{
'url': 'https://www.kaltura.com/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
'only_matching': True,
}, {
'url': 'https://www.oreilly.com/ideas/my-cloud-makes-pretty-pictures',
'info_dict': {
'id': '0_utuok90b',
'ext': 'mp4',
'title': '06_matthew_brender_raj_dutt',
'duration': 331,
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
'timestamp': 1466638791,
'upload_date': '20160622',
'uploader_id': '',
'view_count': int,
},
{
'url': 'https://www.kaltura.com:443/index.php/extwidget/preview/partner_id/1770401/uiconf_id/37307382/entry_id/0_58u8kme7/embed/iframe?&flashvars[streamerType]=auto',
'only_matching': True,
}, {
'url': 'https://fod.infobase.com/p_ViewPlaylist.aspx?AssignmentID=NUN8ZY',
'info_dict': {
'id': '0_izeg5utt',
'ext': 'mp4',
'title': '35871',
'duration': 3403,
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
'timestamp': 1355743100,
'upload_date': '20121217',
'uploader_id': 'cplapp@learn360.com',
'view_count': int,
},
{
# unavailable source format
'url': 'kaltura:513551:1_66x4rg7o',
'only_matching': True,
}, {
'url': 'https://www.cns.nyu.edu/~eero/math-tools17/Videos/lecture-05sep2017.html',
'info_dict': {
'id': '1_9gzouybz',
'ext': 'mp4',
'title': 'lecture-05sep2017',
'duration': 7219,
'thumbnail': r're:https?://cfvod\.kaltura\.com/.+',
'timestamp': 1505340777,
'upload_date': '20170913',
'uploader_id': 'eps2',
'view_count': int,
},
{
# html5lib URL using kwidget player
'url': 'https://cdnapisec.kaltura.com/html5/html5lib/v2.46/mwEmbedFrame.php/p/691292/uiconf_id/20499062/entry_id/0_c076mna6?wid=_691292&iframeembed=true&playerId=kaltura_player_1420508608&entry_id=0_c076mna6&flashvars%5BakamaiHD.loadingPolicy%5D=preInitialize&flashvars%5BakamaiHD.asyncInit%5D=true&flashvars%5BstreamerType%5D=hdnetwork',
'info_dict': {
'id': '0_c076mna6',
'ext': 'mp4',
'title': 'md5:4883e7acbcbf42583a2dddc97dee4855',
'duration': 3608,
'uploader_id': 'commons@swinburne.edu.au',
'timestamp': 1408086874,
'view_count': int,
'upload_date': '20140815',
'thumbnail': 'http://cfvod.kaltura.com/p/691292/sp/69129200/thumbnail/entry_id/0_c076mna6/version/100022',
},
},
{
# html5lib playlist URL using kwidget player
'url': 'https://cdnapisec.kaltura.com/html5/html5lib/v2.89/mwEmbedFrame.php/p/2019031/uiconf_id/40436601?wid=1_4j3m32cv&iframeembed=true&playerId=kaltura_player_&flashvars[playlistAPI.kpl0Id]=1_jovey5nu&flashvars[ks]=&&flashvars[imageDefaultDuration]=30&flashvars[localizationCode]=en&flashvars[leadWithHTML5]=true&flashvars[forceMobileHTML5]=true&flashvars[nextPrevBtn.plugin]=true&flashvars[hotspots.plugin]=true&flashvars[sideBarContainer.plugin]=true&flashvars[sideBarContainer.position]=left&flashvars[sideBarContainer.clickToClose]=true&flashvars[chapters.plugin]=true&flashvars[chapters.layout]=vertical&flashvars[chapters.thumbnailRotator]=false&flashvars[streamSelector.plugin]=true&flashvars[EmbedPlayer.SpinnerTarget]=videoHolder&flashvars[dualScreen.plugin]=true&flashvars[playlistAPI.playlistUrl]=https://canvasgatechtest.kaf.kaltura.com/playlist/details/{playlistAPI.kpl0Id}/categoryid/126428551',
'info_dict': {
'id': '1_jovey5nu',
'title': '00-00 Introduction',
},
'playlist': [
{
'info_dict': {
'id': '1_b1y5hlvx',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Introduction',
'duration': 91,
'thumbnail': 'http://cfvod.kaltura.com/p/2019031/sp/201903100/thumbnail/entry_id/1_b1y5hlvx/version/100001',
'view_count': int,
'timestamp': 1533154447,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
}, {
'info_dict': {
'id': '1_jfb7mdpn',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Three parts to the course',
'duration': 63,
'thumbnail': 'http://cfvod.kaltura.com/p/2019031/sp/201903100/thumbnail/entry_id/1_jfb7mdpn/version/100001',
'view_count': int,
'timestamp': 1533154489,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
}, {
'info_dict': {
'id': '1_8xflxdp7',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Textbooks',
'duration': 37,
'thumbnail': 'http://cfvod.kaltura.com/p/2019031/sp/201903100/thumbnail/entry_id/1_8xflxdp7/version/100001',
'view_count': int,
'timestamp': 1533154512,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
}, {
'info_dict': {
'id': '1_3hqew8kn',
'ext': 'mp4',
'title': 'CS7646_00-00 Introductio_Prerequisites',
'duration': 49,
'thumbnail': 'http://cfvod.kaltura.com/p/2019031/sp/201903100/thumbnail/entry_id/1_3hqew8kn/version/100001',
'view_count': int,
'timestamp': 1533154536,
'upload_date': '20180801',
'uploader_id': 'djoyner3',
},
},
],
},
]
}]
@classmethod
def _extract_embed_urls(cls, url, webpage):

View File

@@ -89,6 +89,15 @@ class KinjaEmbedIE(InfoExtractor):
'url': 'https://kinja.com/ajax/inset/iframe?id=youtube-video-00QyL0AgPAE',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'http://www.clickhole.com/video/dont-understand-bitcoin-man-will-mumble-explanatio-2537',
'info_dict': {
'id': '106351',
'ext': 'mp4',
'title': 'Dont Understand Bitcoin? This Man Will Mumble An Explanation At You',
},
'skip': 'Invalid URL',
}]
_JWPLATFORM_PROVIDER = ('cdn.jwplayer.com/v2/media/', 'JWPlatform')
_PROVIDER_MAP = {
'fb': ('facebook.com/video.php?v=', 'Facebook'),

View File

@@ -1,8 +1,8 @@
from .arkena import ArkenaIE
from .common import InfoExtractor
class LcpPlayIE(ArkenaIE): # XXX: Do not subclass from concrete IE
class LcpPlayIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'https?://play\.lcp\.fr/embed/(?P<id>[^/]+)/(?P<account_id>[^/]+)/[^/]+/[^/]+'
_TESTS = [{
'url': 'http://play.lcp.fr/embed/327336/131064/darkmatter/0',
@@ -21,24 +21,9 @@ class LcpPlayIE(ArkenaIE): # XXX: Do not subclass from concrete IE
class LcpIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'https?://(?:www\.)?lcp\.fr/(?:[^/]+/)*(?P<id>[^/]+)'
_TESTS = [{
# arkena embed
'url': 'http://www.lcp.fr/la-politique-en-video/schwartzenberg-prg-preconise-francois-hollande-de-participer-une-primaire',
'md5': 'b8bd9298542929c06c1c15788b1f277a',
'info_dict': {
'id': 'd56d03e9',
'ext': 'mp4',
'title': 'Schwartzenberg (PRG) préconise à François Hollande de participer à une primaire à gauche',
'description': 'md5:96ad55009548da9dea19f4120c6c16a8',
'timestamp': 1456488895,
'upload_date': '20160226',
},
'params': {
'skip_download': True,
},
}, {
# dailymotion live stream
'url': 'http://www.lcp.fr/le-direct',
'info_dict': {

View File

@@ -18,12 +18,10 @@ class LibsynIE(InfoExtractor):
'info_dict': {
'id': '6385796',
'ext': 'mp3',
'title': 'Champion Minded - Developing a Growth Mindset',
# description fetched using another request:
# http://html5-player.libsyn.com/embed/getitemdetails?item_id=6385796
# 'description': 'In this episode, Allistair talks about the importance of developing a growth mindset, not only in sports, but in life too.',
'title': 'The Allistair McCaw Podcast - Developing a Growth Mindset',
'duration': 834.0,
'thumbnail': r're:https?://assets\.libsyn\.com/.+',
'upload_date': '20180320',
'thumbnail': 're:^https?://.*',
},
}, {
'url': 'https://html5-player.libsyn.com/embed/episode/id/3727166/height/75/width/200/theme/standard/direction/no/autoplay/no/autonext/no/thumbnail/no/preload/no/no_addthis/no/',
@@ -32,8 +30,32 @@ class LibsynIE(InfoExtractor):
'id': '3727166',
'ext': 'mp3',
'title': 'Clients From Hell Podcast - How a Sex Toy Company Kickstarted my Freelance Career',
'thumbnail': r're:https?://assets\.libsyn\.com/.+',
'upload_date': '20150818',
'thumbnail': 're:^https?://.*',
},
'skip': 'Invalid URL',
}]
_WEBPAGE_TESTS = [{
'url': 'https://html5-player.libsyn.com/',
'md5': '50cff329596b8f674d4449ed077ef2f9',
'info_dict': {
'id': '2378831',
'ext': 'mp3',
'title': 'md5:54108b15f98e1b4056612c10b50106b2',
'duration': 3561.0,
'thumbnail': r're:https?://assets\.libsyn\.com/.+',
'upload_date': '20130630',
},
}, {
'url': 'https://undergroundwellness.com/podcasts/306-5-steps-to-permanent-gut-healing/',
'md5': '23576952577f9604520a730d90371761',
'info_dict': {
'id': '3793998',
'ext': 'mp3',
'title': 'Underground Wellness Radio - Jack Tips: 5 Steps to Permanent Gut Healing',
'duration': 3989.0,
'thumbnail': r're:https?://assets\.libsyn\.com/.+',
'upload_date': '20141126',
},
}]

View File

@@ -3,6 +3,7 @@ from ..utils import int_or_none
class LiveJournalIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'https?://(?:[^.]+\.)?livejournal\.com/video/album/\d+.+?\bid=(?P<id>\d+)'
_TEST = {
'url': 'https://andrei-bt.livejournal.com/video/album/407/?mode=view&id=51272',

View File

@@ -16,91 +16,103 @@ class MainStreamingIE(InfoExtractor):
_EMBED_REGEX = [rf'<iframe[^>]+?src=["\']?(?P<url>{_VALID_URL})["\']?']
IE_DESC = 'MainStreaming Player'
_TESTS = [
{
# Live stream offline, has alternative content id
'url': 'https://webtools-e18da6642b684f8aa9ae449862783a56.msvdn.net/embed/53EN6GxbWaJC',
'info_dict': {
'id': '53EN6GxbWaJC',
'title': 'Diretta homepage 2021-12-31 12:00',
'description': '',
'live_status': 'was_live',
'ext': 'mp4',
'thumbnail': r're:https?://[A-Za-z0-9-]*\.msvdn.net/image/\w+/poster',
},
'expected_warnings': [
'Ignoring alternative content ID: WDAF1KOWUpH3',
'MainStreaming said: Live event is OFFLINE',
],
'skip': 'live stream offline',
}, {
# playlist
'url': 'https://webtools-e18da6642b684f8aa9ae449862783a56.msvdn.net/embed/WDAF1KOWUpH3',
'info_dict': {
'id': 'WDAF1KOWUpH3',
'title': 'Playlist homepage',
},
'playlist_mincount': 2,
}, {
# livestream
'url': 'https://webtools-859c1818ed614cc5b0047439470927b0.msvdn.net/embed/tDoFkZD3T1Lw',
'info_dict': {
'id': 'tDoFkZD3T1Lw',
'title': r're:Class CNBC Live \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'live_status': 'is_live',
'ext': 'mp4',
'thumbnail': r're:https?://[A-Za-z0-9-]*\.msvdn.net/image/\w+/poster',
},
'skip': 'live stream',
}, {
'url': 'https://webtools-f5842579ff984c1c98d63b8d789673eb.msvdn.net/embed/EUlZfGWkGpOd?autoPlay=false',
'info_dict': {
'id': 'EUlZfGWkGpOd',
'title': 'La Settimana ',
'description': '03 Ottobre ore 02:00',
'ext': 'mp4',
'live_status': 'not_live',
'thumbnail': r're:https?://[A-Za-z0-9-]*\.msvdn.net/image/\w+/poster',
'duration': 1512,
},
}, {
# video without webtools- prefix
'url': 'https://f5842579ff984c1c98d63b8d789673eb.msvdn.net/embed/MfuWmzL2lGkA?autoplay=false&T=1635860445',
'info_dict': {
'id': 'MfuWmzL2lGkA',
'title': 'TG Mattina',
'description': '06 Ottobre ore 08:00',
'ext': 'mp4',
'live_status': 'not_live',
'thumbnail': r're:https?://[A-Za-z0-9-]*\.msvdn.net/image/\w+/poster',
'duration': 789.04,
},
}, {
# always-on livestream with DVR
'url': 'https://webtools-f5842579ff984c1c98d63b8d789673eb.msvdn.net/embed/HVvPMzy',
'info_dict': {
'id': 'HVvPMzy',
'title': r're:^Diretta LaC News24 \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'description': 'canale all news',
'live_status': 'is_live',
'ext': 'mp4',
'thumbnail': r're:https?://[A-Za-z0-9-]*\.msvdn.net/image/\w+/poster',
},
'params': {
'skip_download': True,
},
}, {
# no host
'url': 'https://webtools.msvdn.net/embed/MfuWmzL2lGkA',
'only_matching': True,
}, {
'url': 'https://859c1818ed614cc5b0047439470927b0.msvdn.net/amp_embed/tDoFkZD3T1Lw',
'only_matching': True,
}, {
'url': 'https://859c1818ed614cc5b0047439470927b0.msvdn.net/content/tDoFkZD3T1Lw#',
'only_matching': True,
_TESTS = [{
# Live stream offline, has alternative content id
'url': 'https://webtools-e18da6642b684f8aa9ae449862783a56.msvdn.net/embed/53EN6GxbWaJC',
'info_dict': {
'id': '53EN6GxbWaJC',
'title': 'Diretta homepage 2021-12-31 12:00',
'description': '',
'live_status': 'was_live',
'ext': 'mp4',
'thumbnail': r're:https?://[\w-]+\.msvdn\.net/image/\w+/poster',
},
]
'expected_warnings': [
'Ignoring alternative content ID: WDAF1KOWUpH3',
'MainStreaming said: Live event is OFFLINE',
],
'skip': 'live stream offline',
}, {
# playlist
'url': 'https://webtools-e18da6642b684f8aa9ae449862783a56.msvdn.net/embed/WDAF1KOWUpH3',
'info_dict': {
'id': 'WDAF1KOWUpH3',
'title': 'Playlist homepage',
},
'playlist_mincount': 2,
}, {
# livestream
'url': 'https://webtools-859c1818ed614cc5b0047439470927b0.msvdn.net/embed/tDoFkZD3T1Lw',
'info_dict': {
'id': 'tDoFkZD3T1Lw',
'title': str,
'live_status': 'is_live',
'ext': 'mp4',
'thumbnail': r're:https?://[\w-]+\.msvdn\.net/image/\w+/poster',
},
'skip': 'live stream',
}, {
'url': 'https://webtools-f5842579ff984c1c98d63b8d789673eb.msvdn.net/embed/EUlZfGWkGpOd?autoPlay=false',
'info_dict': {
'id': 'EUlZfGWkGpOd',
'title': 'La Settimana ',
'description': '03 Ottobre ore 02:00',
'ext': 'mp4',
'live_status': 'not_live',
'thumbnail': r're:https?://[\w-]+\.msvdn\.net/image/\w+/poster',
'duration': 1512,
},
'skip': 'Invalid URL',
}, {
# video without webtools- prefix
'url': 'https://f5842579ff984c1c98d63b8d789673eb.msvdn.net/embed/MfuWmzL2lGkA?autoplay=false&T=1635860445',
'info_dict': {
'id': 'MfuWmzL2lGkA',
'title': 'TG Mattina',
'description': '06 Ottobre ore 08:00',
'ext': 'mp4',
'live_status': 'not_live',
'thumbnail': r're:https?://[\w-]+\.msvdn\.net/image/\w+/poster',
'duration': 789.04,
},
'skip': 'Invalid URL',
}, {
# always-on livestream with DVR
'url': 'https://webtools-f5842579ff984c1c98d63b8d789673eb.msvdn.net/embed/HVvPMzy',
'info_dict': {
'id': 'HVvPMzy',
'title': str,
'description': 'canale all news',
'live_status': 'is_live',
'ext': 'mp4',
'thumbnail': r're:https?://[\w-]+\.msvdn\.net/image/\w+/poster',
},
'params': {'skip_download': 'm3u8'},
}, {
# no host
'url': 'https://webtools.msvdn.net/embed/MfuWmzL2lGkA',
'only_matching': True,
}, {
'url': 'https://859c1818ed614cc5b0047439470927b0.msvdn.net/amp_embed/tDoFkZD3T1Lw',
'only_matching': True,
}, {
'url': 'https://859c1818ed614cc5b0047439470927b0.msvdn.net/content/tDoFkZD3T1Lw#',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# FIXME: Embed detection
'url': 'https://www.lacplay.it/video/in-evidenza_728/lac-storie-p-250-i-santi-pietro-e-paolo_77297/',
'info_dict': {
'id': 'u7kiX5DUaHYr',
'ext': 'mp4',
'title': 'I Santi Pietro e Paolo',
'description': 'md5:ff6be24916ba6b9ae990bf5f3df4911e',
'duration': 1700.0,
'thumbnail': r're:https?://.+',
'tags': '06/07/2025',
'live_status': 'not_live',
},
}]
def _playlist_entries(self, host, playlist_content):
for entry in playlist_content:

View File

@@ -1,15 +1,73 @@
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
determine_ext,
extract_attributes,
int_or_none,
mimetype2ext,
parse_iso8601,
parse_resolution,
str_or_none,
url_or_none,
)
from ..utils.traversal import find_elements, traverse_obj
class MedialaanIE(InfoExtractor):
class MedialaanBaseIE(InfoExtractor):
def _extract_from_mychannels_api(self, mychannels_id):
webpage = self._download_webpage(
f'https://mychannels.video/embed/{mychannels_id}', mychannels_id)
brand_config = self._search_json(
r'window\.mychannels\.brand_config\s*=', webpage, 'brand config', mychannels_id)
response = self._download_json(
f'https://api.mychannels.world/v1/embed/video/{mychannels_id}',
mychannels_id, headers={'X-Mychannels-Brand': brand_config['brand']})
formats = []
for stream in traverse_obj(response, (
'streams', lambda _, v: url_or_none(v['url']),
)):
source_url = stream['url']
ext = determine_ext(source_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
source_url, mychannels_id, 'mp4', m3u8_id='hls', fatal=False))
else:
format_id = traverse_obj(stream, ('quality', {str}))
formats.append({
'ext': ext,
'format_id': format_id,
'url': source_url,
**parse_resolution(format_id),
})
return {
'id': mychannels_id,
'formats': formats,
**traverse_obj(response, {
'title': ('title', {clean_html}),
'description': ('description', {clean_html}, filter),
'duration': ('durationMs', {int_or_none(scale=1000)}, {lambda x: x if x >= 0 else None}),
'genres': ('genre', 'title', {str}, filter, all, filter),
'is_live': ('live', {bool}),
'release_timestamp': ('publicationTimestampMs', {int_or_none(scale=1000)}),
'tags': ('tags', ..., 'title', {str}, filter, all, filter),
'thumbnail': ('image', 'baseUrl', {url_or_none}),
}),
**traverse_obj(response, ('channel', {
'channel': ('title', {clean_html}),
'channel_id': ('id', {str_or_none}),
})),
**traverse_obj(response, ('organisation', {
'uploader': ('title', {clean_html}),
'uploader_id': ('id', {str_or_none}),
})),
**traverse_obj(response, ('show', {
'series': ('title', {clean_html}),
'series_id': ('id', {str_or_none}),
})),
}
class MedialaanIE(MedialaanBaseIE):
_VALID_URL = r'''(?x)
https?://
(?:
@@ -32,7 +90,7 @@ class MedialaanIE(InfoExtractor):
tubantia|
volkskrant
)\.nl
)/video/(?:[^/]+/)*[^/?&#]+~p
)/videos?/(?:[^/?#]+/)*[^/?&#]+(?:-|~p)
)
(?P<id>\d+)
'''
@@ -42,18 +100,83 @@ class MedialaanIE(InfoExtractor):
'id': '193993',
'ext': 'mp4',
'title': 'De terugkeer van Ally de Aap en wie vertrekt er nog bij NAC?',
'timestamp': 1611663540,
'upload_date': '20210126',
'description': 'In een nieuwe Gegenpressing video bespreken Yadran Blanco en Dennis Kas het nieuws omrent NAC.',
'duration': 238,
},
'params': {
'skip_download': True,
'channel': 'BN DeStem',
'channel_id': '418',
'genres': ['Sports'],
'release_date': '20210126',
'release_timestamp': 1611663540,
'series': 'Korte Reportage',
'series_id': '972',
'tags': 'count:2',
'thumbnail': r're:https?://images\.mychannels\.video/imgix/.+\.(?:jpe?g|png)',
'uploader': 'BN De Stem',
'uploader_id': '26',
},
}, {
'url': 'https://www.gelderlander.nl/video/kanalen/degelderlander~c320/series/snel-nieuws~s984/noodbevel-in-doetinchem-politie-stuurt-mensen-centrum-uit~p194093',
'only_matching': True,
'info_dict': {
'id': '194093',
'ext': 'mp4',
'title': 'Noodbevel in Doetinchem: politie stuurt mensen centrum uit',
'description': 'md5:77e85b2cb26cfff9dc1fe2b1db524001',
'duration': 44,
'channel': 'De Gelderlander',
'channel_id': '320',
'genres': ['News'],
'release_date': '20210126',
'release_timestamp': 1611690600,
'series': 'Snel Nieuws',
'series_id': '984',
'tags': 'count:1',
'thumbnail': r're:https?://images\.mychannels\.video/imgix/.+\.(?:jpe?g|png)',
'uploader': 'De Gelderlander',
'uploader_id': '25',
},
}, {
'url': 'https://embed.mychannels.video/sdk/production/193993?options=TFTFF_default',
'url': 'https://www.7sur7.be/videos/production/lla-tendance-tiktok-qui-enflamme-lespagne-707650',
'info_dict': {
'id': '707650',
'ext': 'mp4',
'title': 'La tendance TikTok qui enflamme lEspagne',
'description': 'md5:c7ec4cb733190f227fc8935899f533b5',
'duration': 70,
'channel': 'Lifestyle',
'channel_id': '770',
'genres': ['Beauty & Lifestyle'],
'release_date': '20240906',
'release_timestamp': 1725617330,
'series': 'Lifestyle',
'series_id': '1848',
'tags': 'count:1',
'thumbnail': r're:https?://images\.mychannels\.video/imgix/.+\.(?:jpe?g|png)',
'uploader': '7sur7',
'uploader_id': '67',
},
}, {
'url': 'https://mychannels.video/embed/313117',
'info_dict': {
'id': '313117',
'ext': 'mp4',
'title': str,
'description': 'md5:255e2e52f6fe8a57103d06def438f016',
'channel': 'AD',
'channel_id': '238',
'genres': ['News'],
'live_status': 'is_live',
'release_date': '20241225',
'release_timestamp': 1735169425,
'series': 'Nieuws Update',
'series_id': '3337',
'tags': 'count:1',
'thumbnail': r're:https?://images\.mychannels\.video/imgix/.+\.(?:jpe?g|png)',
'uploader': 'AD',
'uploader_id': '1',
},
'params': {'skip_download': 'Livestream'},
}, {
'url': 'https://embed.mychannels.video/sdk/production/193993',
'only_matching': True,
}, {
'url': 'https://embed.mychannels.video/script/production/193993',
@@ -61,51 +184,41 @@ class MedialaanIE(InfoExtractor):
}, {
'url': 'https://embed.mychannels.video/production/193993',
'only_matching': True,
}, {
'url': 'https://mychannels.video/embed/193993',
'only_matching': True,
}, {
'url': 'https://embed.mychannels.video/embed/193993',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.demorgen.be/snelnieuws/tom-waes-promoot-alcoholtesten-op-werchter-ik-ben-de-laatste-persoon-die-met-de-vinger-moet-wijzen~b7457c0d/',
'info_dict': {
'id': '1576607',
'ext': 'mp4',
'title': 'Tom Waes blaastest',
'channel': 'De Morgen',
'channel_id': '352',
'description': 'Tom Waes werkt mee aan een alcoholcampagne op Werchter',
'duration': 62,
'genres': ['News'],
'release_date': '20250705',
'release_timestamp': 1751730795,
'series': 'Nieuwsvideo\'s',
'series_id': '1683',
'tags': 'count:1',
'thumbnail': r're:https?://video-images\.persgroep\.be/aws_generated.+\.jpg',
'uploader': 'De Morgen',
'uploader_id': '17',
},
'params': {'extractor_args': {'generic': {'impersonate': ['chrome']}}},
}]
@classmethod
def _extract_embed_urls(cls, url, webpage):
entries = []
for element in re.findall(r'(<div[^>]+data-mychannels-type="video"[^>]*>)', webpage):
mychannels_id = extract_attributes(element).get('data-mychannels-id')
if mychannels_id:
entries.append('https://mychannels.video/embed/' + mychannels_id)
return entries
yield from traverse_obj(webpage, (
{find_elements(tag='div', attr='data-mychannels-type', value='video', html=True)},
..., {extract_attributes}, 'data-mychannels-id', {str}, filter,
{lambda x: f'https://mychannels.video/embed/{x}'}))
def _real_extract(self, url):
production_id = self._match_id(url)
production = self._download_json(
'https://embed.mychannels.video/sdk/production/' + production_id,
production_id, query={'options': 'UUUU_default'})['productions'][0]
title = production['title']
mychannels_id = self._match_id(url)
formats = []
for source in (production.get('sources') or []):
src = source.get('src')
if not src:
continue
ext = mimetype2ext(source.get('type'))
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
src, production_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
else:
formats.append({
'ext': ext,
'url': src,
})
return {
'id': production_id,
'title': title,
'formats': formats,
'thumbnail': production.get('posterUrl'),
'timestamp': parse_iso8601(production.get('publicationDate'), ' '),
'duration': int_or_none(production.get('duration')) or None,
}
return self._extract_from_mychannels_api(mychannels_id)

View File

@@ -31,10 +31,9 @@ class MegaTVComIE(MegaTVComBaseIE):
IE_NAME = 'megatvcom'
IE_DESC = 'megatv.com videos'
_VALID_URL = r'https?://(?:www\.)?megatv\.com/(?:\d{4}/\d{2}/\d{2}|[^/]+/(?P<id>\d+))/(?P<slug>[^/]+)'
_TESTS = [{
# FIXME: Unable to extract article id
'url': 'https://www.megatv.com/2021/10/23/egkainia-gia-ti-nea-skini-omega-tou-dimotikou-theatrou-peiraia/',
'md5': '6546a1a37fff0dd51c9dce5f490b7d7d',
'info_dict': {
'id': '520979',
'ext': 'mp4',
@@ -43,20 +42,19 @@ class MegaTVComIE(MegaTVComBaseIE):
'timestamp': 1634975747,
'upload_date': '20211023',
'display_id': 'egkainia-gia-ti-nea-skini-omega-tou-dimotikou-theatrou-peiraia',
'thumbnail': 'https://www.megatv.com/wp-content/uploads/2021/10/ΠΕΙΡΑΙΑΣ-1024x450.jpg',
'thumbnail': r're:https?://www\.megatv\.com/wp-content/uploads/.+\.jpg',
},
}, {
'url': 'https://www.megatv.com/tvshows/527800/epeisodio-65-12/',
'md5': 'cba2085d45c1abeb8e7e9b7e1d6c0072',
'info_dict': {
'id': '527800',
'ext': 'mp4',
'title': 'md5:fc322cb51f682eecfe2f54cd5ab3a157',
'title': 'Η Γη της Ελιάς: Επεισόδιο 65 - A\' ΚΥΚΛΟΣ ',
'description': 'md5:b2b7ed3690a78f2a0156eb790fdc00df',
'timestamp': 1636048859,
'upload_date': '20211104',
'display_id': 'epeisodio-65-12',
'thumbnail': 'https://www.megatv.com/wp-content/uploads/2021/11/16-1-1.jpg',
'thumbnail': r're:https?://www\.megatv\.com/wp-content/uploads/.+\.jpg',
},
}]
@@ -104,8 +102,8 @@ class MegaTVComEmbedIE(MegaTVComBaseIE):
IE_DESC = 'megatv.com embedded videos'
_VALID_URL = r'(?:https?:)?//(?:www\.)?megatv\.com/embed/?\?p=(?P<id>\d+)'
_EMBED_REGEX = [rf'''<iframe[^>]+?src=(?P<_q1>["'])(?P<url>{_VALID_URL})(?P=_q1)''']
_TESTS = [{
# FIXME: Unable to extract article id
'url': 'https://www.megatv.com/embed/?p=2020520979',
'md5': '6546a1a37fff0dd51c9dce5f490b7d7d',
'info_dict': {
@@ -119,6 +117,7 @@ class MegaTVComEmbedIE(MegaTVComBaseIE):
'thumbnail': 'https://www.megatv.com/wp-content/uploads/2021/10/ΠΕΙΡΑΙΑΣ-1024x450.jpg',
},
}, {
# FIXME: Unable to extract article id
'url': 'https://www.megatv.com/embed/?p=2020534081',
'md5': '6ac8b3ce4dc6120c802f780a1e6b3812',
'info_dict': {
@@ -132,6 +131,15 @@ class MegaTVComEmbedIE(MegaTVComBaseIE):
'thumbnail': 'https://www.megatv.com/wp-content/uploads/2021/11/Capture-266.jpg',
},
}]
_WEBPAGE_TESTS = [{
# FIXME: Unable to extract article id
'url': 'https://www.in.gr/2021/12/18/greece/apokalypsi-mega-poios-parelave-tin-ereyna-tsiodra-ek-merous-tis-kyvernisis-o-prothypourgos-telika-gnorize/',
'info_dict': {
'id': 'apokalypsi-mega-poios-parelave-tin-ereyna-tsiodra-ek-merous-tis-kyvernisis-o-prothypourgos-telika-gnorize',
'title': 'md5:5e569cf996ec111057c2764ec272848f',
},
'playlist_count': 2,
}]
def _match_canonical_url(self, webpage):
LINK_RE = r'''(?x)

View File

@@ -79,7 +79,7 @@ class MiTeleIE(TelecincoBaseIE):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_akamai_webpage(url, display_id)
webpage = self._download_webpage(url, display_id)
pre_player = self._search_json(
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=',
webpage, 'Pre Player', display_id)['prePlayer']

View File

@@ -105,89 +105,85 @@ class MLBIE(MLBBaseIE):
r'<iframe[^>]+?src=(["\'])(?P<url>https?://m(?:lb)?\.mlb\.com/shared/video/embed/embed\.html\?.+?)\1',
r'data-video-link=["\'](?P<url>http://m\.mlb\.com/video/[^"\']+)',
]
_TESTS = [
{
'url': 'https://www.mlb.com/mariners/video/ackleys-spectacular-catch/c-34698933',
'md5': '632358dacfceec06bad823b83d21df2d',
'info_dict': {
'id': '34698933',
'ext': 'mp4',
'title': "Ackley's spectacular catch",
'description': 'md5:7f5a981eb4f3cbc8daf2aeffa2215bf0',
'duration': 66,
'timestamp': 1405995000,
'upload_date': '20140722',
'thumbnail': r're:^https?://.*\.jpg$',
},
_TESTS = [{
'url': 'https://www.mlb.com/mariners/video/ackleys-spectacular-catch/c-34698933',
'info_dict': {
'id': '34698933',
'ext': 'mp4',
'title': 'Ackley\'s spectacular catch',
'description': 'md5:7f5a981eb4f3cbc8daf2aeffa2215bf0',
'duration': 66,
'timestamp': 1405995000,
'upload_date': '20140722',
'thumbnail': r're:https?://.+\.jpg',
},
{
'url': 'https://www.mlb.com/video/stanton-prepares-for-derby/c-34496663',
'md5': 'bf2619bf9cacc0a564fc35e6aeb9219f',
'info_dict': {
'id': '34496663',
'ext': 'mp4',
'title': 'Stanton prepares for Derby',
'description': 'md5:d00ce1e5fd9c9069e9c13ab4faedfa57',
'duration': 46,
'timestamp': 1405120200,
'upload_date': '20140711',
'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'https://www.mlb.com/video/stanton-prepares-for-derby/c-34496663',
'info_dict': {
'id': '34496663',
'ext': 'mp4',
'title': 'Stanton prepares for Derby',
'description': 'md5:d00ce1e5fd9c9069e9c13ab4faedfa57',
'duration': 46,
'timestamp': 1405120200,
'upload_date': '20140711',
'thumbnail': r're:https?://.+\.jpg',
},
{
'url': 'https://www.mlb.com/video/cespedes-repeats-as-derby-champ/c-34578115',
'md5': '99bb9176531adc600b90880fb8be9328',
'info_dict': {
'id': '34578115',
'ext': 'mp4',
'title': 'Cespedes repeats as Derby champ',
'description': 'md5:08df253ce265d4cf6fb09f581fafad07',
'duration': 488,
'timestamp': 1405414336,
'upload_date': '20140715',
'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'https://www.mlb.com/video/cespedes-repeats-as-derby-champ/c-34578115',
'info_dict': {
'id': '34578115',
'ext': 'mp4',
'title': 'Cespedes repeats as Derby champ',
'description': 'md5:08df253ce265d4cf6fb09f581fafad07',
'duration': 488,
'timestamp': 1405414336,
'upload_date': '20140715',
'thumbnail': r're:https?://.+\.jpg',
},
{
'url': 'https://www.mlb.com/video/bautista-on-home-run-derby/c-34577915',
'md5': 'da8b57a12b060e7663ee1eebd6f330ec',
'info_dict': {
'id': '34577915',
'ext': 'mp4',
'title': 'Bautista on Home Run Derby',
'description': 'md5:b80b34031143d0986dddc64a8839f0fb',
'duration': 52,
'timestamp': 1405405122,
'upload_date': '20140715',
'thumbnail': r're:^https?://.*\.jpg$',
},
}, {
'url': 'https://www.mlb.com/video/bautista-on-home-run-derby/c-34577915',
'info_dict': {
'id': '34577915',
'ext': 'mp4',
'title': 'Bautista on Home Run Derby',
'description': 'md5:b80b34031143d0986dddc64a8839f0fb',
'duration': 52,
'timestamp': 1405405122,
'upload_date': '20140715',
'thumbnail': r're:https?://.+\.jpg',
},
{
'url': 'https://www.mlb.com/video/hargrove-homers-off-caldwell/c-1352023483?tid=67793694',
'only_matching': True,
}, {
'url': 'https://www.mlb.com/video/hargrove-homers-off-caldwell/c-1352023483?tid=67793694',
'only_matching': True,
}, {
'url': 'http://m.mlb.com/shared/video/embed/embed.html?content_id=35692085&topic_id=6479266&width=400&height=224&property=mlb',
'only_matching': True,
}, {
'url': 'http://mlb.mlb.com/shared/video/embed/embed.html?content_id=36599553',
'only_matching': True,
}, {
'url': 'http://mlb.mlb.com/es/video/play.jsp?content_id=36599553',
'only_matching': True,
}, {
'url': 'https://www.mlb.com/cardinals/video/piscottys-great-sliding-catch/c-51175783',
'only_matching': True,
}, {
# From http://m.mlb.com/news/article/118550098/blue-jays-kevin-pillar-goes-spidey-up-the-wall-to-rob-tim-beckham-of-a-homer
'url': 'http://mlb.mlb.com/shared/video/embed/m-internal-embed.html?content_id=75609783&property=mlb&autoplay=true&hashmode=false&siteSection=mlb/multimedia/article_118550098/article_embed&club=mlb',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.mlbdailydish.com/2013/2/25/4028804/mlb-classic-video-vault-open-watch-embed-share',
'info_dict': {
'id': 'mlb-classic-video-vault-open-watch-embed-share',
'title': 'MLB Classic vault is open! Don\'t avert your eyes!',
'age_limit': 0,
'description': 'All the video needed to hold you over until real baseball starts next month.',
'thumbnail': r're:https?://cdn\.vox-cdn\.com/thumbor/.+\.jpg',
},
{
'url': 'http://m.mlb.com/shared/video/embed/embed.html?content_id=35692085&topic_id=6479266&width=400&height=224&property=mlb',
'only_matching': True,
},
{
'url': 'http://mlb.mlb.com/shared/video/embed/embed.html?content_id=36599553',
'only_matching': True,
},
{
'url': 'http://mlb.mlb.com/es/video/play.jsp?content_id=36599553',
'only_matching': True,
},
{
'url': 'https://www.mlb.com/cardinals/video/piscottys-great-sliding-catch/c-51175783',
'only_matching': True,
},
{
# From http://m.mlb.com/news/article/118550098/blue-jays-kevin-pillar-goes-spidey-up-the-wall-to-rob-tim-beckham-of-a-homer
'url': 'http://mlb.mlb.com/shared/video/embed/m-internal-embed.html?content_id=75609783&property=mlb&autoplay=true&hashmode=false&siteSection=mlb/multimedia/article_118550098/article_embed&club=mlb',
'only_matching': True,
},
]
'playlist_count': 3,
}]
_TIMESTAMP_KEY = 'date'
@staticmethod
@@ -215,20 +211,19 @@ class MLBIE(MLBBaseIE):
class MLBVideoIE(MLBBaseIE):
_VALID_URL = r'https?://(?:www\.)?mlb\.com/(?:[^/]+/)*video/(?P<id>[^/?&#]+)'
_TEST = {
_TESTS = [{
'url': 'https://www.mlb.com/mariners/video/ackley-s-spectacular-catch-c34698933',
'md5': '632358dacfceec06bad823b83d21df2d',
'info_dict': {
'id': 'c04a8863-f569-42e6-9f87-992393657614',
'ext': 'mp4',
'title': "Ackley's spectacular catch",
'title': 'Ackley\'s spectacular catch',
'description': 'md5:7f5a981eb4f3cbc8daf2aeffa2215bf0',
'duration': 66,
'timestamp': 1405995000,
'upload_date': '20140722',
'thumbnail': r're:^https?://.+',
'thumbnail': r're:https?://.+',
},
}
}]
_TIMESTAMP_KEY = 'timestamp'
@classmethod

View File

@@ -51,23 +51,7 @@ class MotherlessIE(InfoExtractor):
'skip': '404',
}, {
'url': 'http://motherless.com/g/cosplay/633979F',
'md5': '0b2a43f447a49c3e649c93ad1fafa4a0',
'info_dict': {
'id': '633979F',
'ext': 'mp4',
'title': 'Turtlette',
'categories': ['superheroine heroine superher'],
'upload_date': '20140827',
'uploader_id': 'shade0230',
'thumbnail': r're:https?://.*\.jpg',
'age_limit': 18,
'like_count': int,
'comment_count': int,
'view_count': int,
},
'params': {
'nocheckcertificate': True,
},
'expected_exception': 'ExtractorError',
}, {
'url': 'http://motherless.com/8B4BBC1',
'info_dict': {
@@ -113,8 +97,10 @@ class MotherlessIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
if any(p in webpage for p in (
'<title>404 - MOTHERLESS.COM<',
">The page you're looking for cannot be found.<")):
'<title>404 - MOTHERLESS.COM<',
">The page you're looking for cannot be found.<",
'<div class="error-page',
)):
raise ExtractorError(f'Video {video_id} does not exist', expected=True)
if '>The content you are trying to view is for friends only.' in webpage:
@@ -183,6 +169,9 @@ class MotherlessPaginatedIE(InfoExtractor):
def _correct_path(self, url, item_id):
raise NotImplementedError('This method must be implemented by subclasses')
def _correct_title(self, title, /):
return title.partition(' - Videos')[0] if title else None
def _extract_entries(self, webpage, base):
for mobj in re.finditer(r'href="[^"]*(?P<href>/[A-F0-9]+)"\s+title="(?P<title>[^"]+)',
webpage):
@@ -205,7 +194,7 @@ class MotherlessPaginatedIE(InfoExtractor):
return self.playlist_result(
OnDemandPagedList(get_page, self._PAGE_SIZE), item_id,
remove_end(self._html_extract_title(webpage), ' | MOTHERLESS.COM ™'))
self._correct_title(self._html_extract_title(webpage)))
class MotherlessGroupIE(MotherlessPaginatedIE):
@@ -214,7 +203,7 @@ class MotherlessGroupIE(MotherlessPaginatedIE):
'url': 'http://motherless.com/gv/movie_scenes',
'info_dict': {
'id': 'movie_scenes',
'title': 'Movie Scenes - Videos - Hot and sexy scenes from "regular" movies... Beautiful actresses fully',
'title': 'Movie Scenes',
},
'playlist_mincount': 540,
}, {
@@ -230,7 +219,7 @@ class MotherlessGroupIE(MotherlessPaginatedIE):
'id': 'beautiful_cock',
'title': 'Beautiful Cock',
},
'playlist_mincount': 2040,
'playlist_mincount': 371,
}]
def _correct_path(self, url, item_id):
@@ -245,14 +234,14 @@ class MotherlessGalleryIE(MotherlessPaginatedIE):
'id': '338999F',
'title': 'Random',
},
'playlist_mincount': 171,
'playlist_mincount': 100,
}, {
'url': 'https://motherless.com/GVABD6213',
'info_dict': {
'id': 'ABD6213',
'title': 'Cuties',
},
'playlist_mincount': 2,
'playlist_mincount': 1,
}, {
'url': 'https://motherless.com/GVBCF7622',
'info_dict': {
@@ -266,9 +255,12 @@ class MotherlessGalleryIE(MotherlessPaginatedIE):
'id': '035DE2F',
'title': 'General',
},
'playlist_mincount': 420,
'playlist_mincount': 234,
}]
def _correct_title(self, title, /):
return remove_end(title, ' | MOTHERLESS.COM ™')
def _correct_path(self, url, item_id):
return urllib.parse.urljoin(url, f'/GV{item_id}')
@@ -279,14 +271,14 @@ class MotherlessUploaderIE(MotherlessPaginatedIE):
'url': 'https://motherless.com/u/Mrgo4hrs2023',
'info_dict': {
'id': 'Mrgo4hrs2023',
'title': "Mrgo4hrs2023's Uploads - Videos",
'title': "Mrgo4hrs2023's Uploads",
},
'playlist_mincount': 32,
}, {
'url': 'https://motherless.com/u/Happy_couple?t=v',
'info_dict': {
'id': 'Happy_couple',
'title': "Happy_couple's Uploads - Videos",
'title': "Happy_couple's Uploads",
},
'playlist_mincount': 8,
}]

View File

@@ -1,652 +1,268 @@
import re
import xml.etree.ElementTree
import base64
import json
import time
import urllib.parse
from .common import InfoExtractor
from ..networking import HEADRequest, Request
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
RegexNotFoundError,
find_xpath_attr,
fix_xml_ampersands,
float_or_none,
int_or_none,
join_nonempty,
strip_or_none,
timeconvert,
try_get,
unescapeHTML,
js_to_json,
jwt_decode_hs256,
parse_iso8601,
parse_qs,
update_url,
update_url_query,
url_basename,
xpath_text,
url_or_none,
)
from ..utils.traversal import require, traverse_obj
def _media_xml_tag(tag):
return f'{{http://search.yahoo.com/mrss/}}{tag}'
class MTVServicesInfoExtractor(InfoExtractor):
_MOBILE_TEMPLATE = None
_LANG = None
class MTVServicesBaseIE(InfoExtractor):
_GEO_BYPASS = False
_GEO_COUNTRIES = ['US']
_CACHE_SECTION = 'mtvservices'
_ACCESS_TOKEN_KEY = 'access'
_REFRESH_TOKEN_KEY = 'refresh'
_MEDIA_TOKEN_KEY = 'media'
_token_cache = {}
@staticmethod
def _id_from_uri(uri):
return uri.split(':')[-1]
def _jwt_is_expired(token):
return jwt_decode_hs256(token)['exp'] - time.time() < 120
@staticmethod
def _remove_template_parameter(url):
# Remove the templates, like &device={device}
return re.sub(r'&[^=]*?={.*?}(?=(&|$))', '', url)
def _get_auth_suite_data(config):
return traverse_obj(config, {
'clientId': ('clientId', {str}),
'countryCode': ('countryCode', {str}),
})
def _get_feed_url(self, uri, url=None):
return self._FEED_URL
def _get_thumbnail_url(self, uri, itemdoc):
search_path = '{}/{}'.format(_media_xml_tag('group'), _media_xml_tag('thumbnail'))
thumb_node = itemdoc.find(search_path)
if thumb_node is None:
return None
return thumb_node.get('url') or thumb_node.text or None
def _extract_mobile_video_formats(self, mtvn_id):
webpage_url = self._MOBILE_TEMPLATE % mtvn_id
req = Request(webpage_url)
# Otherwise we get a webpage that would execute some javascript
req.headers['User-Agent'] = 'curl/7'
webpage = self._download_webpage(req, mtvn_id,
'Downloading mobile page')
metrics_url = unescapeHTML(self._search_regex(r'<a href="(http://metrics.+?)"', webpage, 'url'))
req = HEADRequest(metrics_url)
response = self._request_webpage(req, mtvn_id, 'Resolving url')
url = response.url
# Transform the url to get the best quality:
url = re.sub(r'.+pxE=mp4', 'http://mtvnmobile.vo.llnwd.net/kip0/_pxn=0+_pxK=18639+_pxE=mp4', url, count=1)
return [{'url': url, 'ext': 'mp4'}]
def _extract_video_formats(self, mdoc, mtvn_id, video_id):
if re.match(r'.*/(error_country_block\.swf|geoblock\.mp4|copyright_error\.flv(?:\?geo\b.+?)?)$', mdoc.find('.//src').text) is not None:
if mtvn_id is not None and self._MOBILE_TEMPLATE is not None:
self.to_screen('The normal version is not available from your '
'country, trying with the mobile version')
return self._extract_mobile_video_formats(mtvn_id)
raise ExtractorError('This video is not available from your country.',
expected=True)
formats = []
for rendition in mdoc.findall('.//rendition'):
if rendition.get('method') == 'hls':
hls_url = rendition.find('./src').text
formats.extend(self._extract_m3u8_formats(
hls_url, video_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
else:
# fms
try:
_, _, ext = rendition.attrib['type'].partition('/')
rtmp_video_url = rendition.find('./src').text
if 'error_not_available.swf' in rtmp_video_url:
raise ExtractorError(
f'{self.IE_NAME} said: video is not available',
expected=True)
if rtmp_video_url.endswith('siteunavail.png'):
continue
formats.extend([{
'ext': 'flv' if rtmp_video_url.startswith('rtmp') else ext,
'url': rtmp_video_url,
'format_id': join_nonempty(
'rtmp' if rtmp_video_url.startswith('rtmp') else None,
rendition.get('bitrate')),
'width': int(rendition.get('width')),
'height': int(rendition.get('height')),
}])
except (KeyError, TypeError):
raise ExtractorError('Invalid rendition field.')
return formats
def _extract_subtitles(self, mdoc, mtvn_id):
subtitles = {}
for transcript in mdoc.findall('.//transcript'):
if transcript.get('kind') != 'captions':
continue
lang = transcript.get('srclang')
for typographic in transcript.findall('./typographic'):
sub_src = typographic.get('src')
if not sub_src:
continue
ext = typographic.get('format')
if ext == 'cea-608':
ext = 'scc'
subtitles.setdefault(lang, []).append({
'url': str(sub_src),
'ext': ext,
})
return subtitles
def _get_video_info(self, itemdoc, use_hls=True):
uri = itemdoc.find('guid').text
video_id = self._id_from_uri(uri)
self.report_extraction(video_id)
content_el = itemdoc.find('{}/{}'.format(_media_xml_tag('group'), _media_xml_tag('content')))
mediagen_url = self._remove_template_parameter(content_el.attrib['url'])
mediagen_url = mediagen_url.replace('device={device}', '')
if 'acceptMethods' not in mediagen_url:
mediagen_url += '&' if '?' in mediagen_url else '?'
mediagen_url += 'acceptMethods='
mediagen_url += 'hls' if use_hls else 'fms'
mediagen_doc = self._download_xml(
mediagen_url, video_id, 'Downloading video urls', fatal=False)
if not isinstance(mediagen_doc, xml.etree.ElementTree.Element):
return None
item = mediagen_doc.find('./video/item')
if item is not None and item.get('type') == 'text':
message = f'{self.IE_NAME} returned error: '
if item.get('code') is not None:
message += '{} - '.format(item.get('code'))
message += item.text
raise ExtractorError(message, expected=True)
description = strip_or_none(xpath_text(itemdoc, 'description'))
timestamp = timeconvert(xpath_text(itemdoc, 'pubDate'))
title_el = None
if title_el is None:
title_el = find_xpath_attr(
itemdoc, './/{http://search.yahoo.com/mrss/}category',
'scheme', 'urn:mtvn:video_title')
if title_el is None:
title_el = itemdoc.find('.//{http://search.yahoo.com/mrss/}title')
if title_el is None:
title_el = itemdoc.find('.//title')
if title_el.text is None:
title_el = None
title = title_el.text
if title is None:
raise ExtractorError('Could not find video title')
title = title.strip()
series = find_xpath_attr(
itemdoc, './/{http://search.yahoo.com/mrss/}category',
'scheme', 'urn:mtvn:franchise')
season = find_xpath_attr(
itemdoc, './/{http://search.yahoo.com/mrss/}category',
'scheme', 'urn:mtvn:seasonN')
episode = find_xpath_attr(
itemdoc, './/{http://search.yahoo.com/mrss/}category',
'scheme', 'urn:mtvn:episodeN')
series = series.text if series is not None else None
season = season.text if season is not None else None
episode = episode.text if episode is not None else None
if season and episode:
# episode number includes season, so remove it
episode = re.sub(rf'^{season}', '', episode)
# This a short id that's used in the webpage urls
mtvn_id = None
mtvn_id_node = find_xpath_attr(itemdoc, './/{http://search.yahoo.com/mrss/}category',
'scheme', 'urn:mtvn:id')
if mtvn_id_node is not None:
mtvn_id = mtvn_id_node.text
formats = self._extract_video_formats(mediagen_doc, mtvn_id, video_id)
# Some parts of complete video may be missing (e.g. missing Act 3 in
# http://www.southpark.de/alle-episoden/s14e01-sexual-healing)
if not formats:
return None
return {
'title': title,
'formats': formats,
'subtitles': self._extract_subtitles(mediagen_doc, mtvn_id),
'id': video_id,
'thumbnail': self._get_thumbnail_url(uri, itemdoc),
'description': description,
'duration': float_or_none(content_el.attrib.get('duration')),
'timestamp': timestamp,
'series': series,
'season_number': int_or_none(season),
'episode_number': int_or_none(episode),
def _call_auth_api(self, path, config, display_id=None, note=None, data=None, headers=None, query=None):
headers = {
'Accept': 'application/json',
'Client-Description': 'deviceName=Chrome Windows;deviceType=desktop;system=Windows NT 10.0',
'Api-Version': '2025-07-09',
**(headers or {}),
}
if data is not None:
headers['Content-Type'] = 'application/json'
if isinstance(data, dict):
data = json.dumps(data, separators=(',', ':')).encode()
def _get_feed_query(self, uri):
data = {'uri': uri}
if self._LANG:
data['lang'] = self._LANG
return data
return self._download_json(
f'https://auth.mtvnservices.com/{path}', display_id,
note=note or 'Calling authentication API', data=data,
headers=headers, query={**self._get_auth_suite_data(config), **(query or {})})
def _get_videos_info(self, uri, use_hls=True, url=None):
video_id = self._id_from_uri(uri)
feed_url = self._get_feed_url(uri, url)
info_url = update_url_query(feed_url, self._get_feed_query(uri))
return self._get_videos_info_from_url(info_url, video_id, use_hls)
def _get_fresh_access_token(self, config, display_id=None, force_refresh=False):
resource_id = config['resourceId']
# resource_id should already be in _token_cache since _get_media_token is the caller
tokens = self._token_cache[resource_id]
def _get_videos_info_from_url(self, url, video_id, use_hls=True):
idoc = self._download_xml(
url, video_id,
'Downloading info', transform_source=fix_xml_ampersands)
access_token = tokens.get(self._ACCESS_TOKEN_KEY)
if not force_refresh and access_token and not self._jwt_is_expired(access_token):
return access_token
title = xpath_text(idoc, './channel/title')
description = xpath_text(idoc, './channel/description')
if self._REFRESH_TOKEN_KEY not in tokens:
response = self._call_auth_api(
'accessToken', config, display_id, 'Retrieving auth tokens', data=b'')
else:
response = self._call_auth_api(
'accessToken/refresh', config, display_id, 'Refreshing auth tokens',
data={'refreshToken': tokens[self._REFRESH_TOKEN_KEY]},
headers={'Authorization': f'Bearer {access_token}'})
entries = []
for item in idoc.findall('.//item'):
info = self._get_video_info(item, use_hls)
if info:
entries.append(info)
tokens[self._ACCESS_TOKEN_KEY] = response['applicationAccessToken']
tokens[self._REFRESH_TOKEN_KEY] = response['deviceRefreshToken']
self.cache.store(self._CACHE_SECTION, resource_id, tokens)
# TODO: should be multi-video
return self.playlist_result(
entries, playlist_title=title, playlist_description=description)
return tokens[self._ACCESS_TOKEN_KEY]
def _extract_triforce_mgid(self, webpage, data_zone=None, video_id=None):
triforce_feed = self._parse_json(self._search_regex(
r'triforceManifestFeed\s*=\s*({.+?})\s*;\s*\n', webpage,
'triforce feed', default='{}'), video_id, fatal=False)
def _get_media_token(self, video_config, config, display_id=None):
resource_id = config['resourceId']
if resource_id in self._token_cache:
tokens = self._token_cache[resource_id]
else:
tokens = self._token_cache[resource_id] = self.cache.load(self._CACHE_SECTION, resource_id) or {}
data_zone = self._search_regex(
r'data-zone=(["\'])(?P<zone>.+?_lc_promo.*?)\1', webpage,
'data zone', default=data_zone, group='zone')
media_token = tokens.get(self._MEDIA_TOKEN_KEY)
if media_token and not self._jwt_is_expired(media_token):
return media_token
feed_url = try_get(
triforce_feed, lambda x: x['manifest']['zones'][data_zone]['feed'],
str)
if not feed_url:
return
access_token = self._get_fresh_access_token(config, display_id)
if not jwt_decode_hs256(access_token).get('accessMethods'):
# MTVServices uses a custom AdobePass oauth flow which is incompatible with AdobePassIE
mso_id = self.get_param('ap_mso')
if not mso_id:
raise ExtractorError(
'This video is only available for users of participating TV providers. '
'Use --ap-mso to specify Adobe Pass Multiple-system operator Identifier and pass '
'cookies from a browser session where you are signed-in to your provider.', expected=True)
feed = self._download_json(feed_url, video_id, fatal=False)
if not feed:
return
auth_suite_data = json.dumps(
self._get_auth_suite_data(config), separators=(',', ':')).encode()
callback_url = update_url_query(config['callbackURL'], {
'authSuiteData': urllib.parse.quote(base64.b64encode(auth_suite_data).decode()),
'mvpdCode': mso_id,
})
auth_url = self._call_auth_api(
f'mvpd/{mso_id}/login', config, display_id,
'Retrieving provider authentication URL',
query={'callbackUrl': callback_url},
headers={'Authorization': f'Bearer {access_token}'})['authenticationUrl']
res = self._download_webpage_handle(auth_url, display_id, 'Downloading provider auth page')
# XXX: The following "provider-specific code" likely only works if mso_id == Comcast_SSO
# BEGIN provider-specific code
redirect_url = self._search_json(
r'initInterstitialRedirect\(', res[0], 'redirect JSON',
display_id, transform_source=js_to_json)['continue']
urlh = self._request_webpage(redirect_url, display_id, 'Requesting provider redirect page')
authorization_code = parse_qs(urlh.url)['authorizationCode'][-1]
# END provider-specific code
self._call_auth_api(
f'access/mvpd/{mso_id}', config, display_id,
'Submitting authorization code to MTVNServices',
query={'authorizationCode': authorization_code}, data=b'',
headers={'Authorization': f'Bearer {access_token}'})
access_token = self._get_fresh_access_token(config, display_id, force_refresh=True)
return try_get(feed, lambda x: x['result']['data']['id'], str)
tokens[self._MEDIA_TOKEN_KEY] = self._call_auth_api(
'mediaToken', config, display_id, 'Fetching media token', data={
'content': {('id' if k == 'videoId' else k): v for k, v in video_config.items()},
'resourceId': resource_id,
}, headers={'Authorization': f'Bearer {access_token}'})['mediaToken']
@staticmethod
def _extract_child_with_type(parent, t):
for c in parent['children']:
if c.get('type') == t:
return c
self.cache.store(self._CACHE_SECTION, resource_id, tokens)
return tokens[self._MEDIA_TOKEN_KEY]
def _real_extract(self, url):
display_id = self._match_id(url)
def _extract_mgid(self, webpage):
try:
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
# or http://media.mtvnservices.com/{mgid}
og_url = self._og_search_video_url(webpage)
mgid = url_basename(og_url)
if mgid.endswith('.swf'):
mgid = mgid[:-4]
except RegexNotFoundError:
mgid = None
data = self._download_json(
update_url(url, query=None), display_id,
query={'json': 'true'})
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 404 and not self.suitable(e.cause.response.url):
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
raise
if mgid is None or ':' not in mgid:
mgid = self._search_regex(
[r'data-mgid="(.*?)"', r'swfobject\.embedSWF\(".*?(mgid:.*?)"'],
webpage, 'mgid', default=None)
flex_wrapper = traverse_obj(data, (
'children', lambda _, v: v['type'] == 'MainContainer',
(None, ('children', lambda _, v: v['type'] == 'AviaWrapper')),
'children', lambda _, v: v['type'] == 'FlexWrapper', {dict}, any))
video_detail = traverse_obj(flex_wrapper, (
(None, ('children', lambda _, v: v['type'] == 'AuthSuiteWrapper')),
'children', lambda _, v: v['type'] == 'Player',
'props', 'videoDetail', {dict}, any))
if not video_detail:
video_detail = traverse_obj(data, (
'children', ..., ('handleTVEAuthRedirection', None),
'videoDetail', {dict}, any, {require('video detail')}))
if not mgid:
sm4_embed = self._html_search_meta(
'sm4:video:embed', webpage, 'sm4 embed', default='')
mgid = self._search_regex(
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=None)
mgid = video_detail['mgid']
video_id = mgid.rpartition(':')[2]
service_url = traverse_obj(video_detail, ('videoServiceUrl', {url_or_none}, {update_url(query=None)}))
if not service_url:
raise ExtractorError('This content is no longer available', expected=True)
if not mgid:
mgid = self._extract_triforce_mgid(webpage)
headers = {}
if video_detail.get('authRequired'):
# The vast majority of provider-locked content has been moved to Paramount+
# BetIE is the only extractor that is currently known to reach this code path
video_config = traverse_obj(flex_wrapper, (
'children', lambda _, v: v['type'] == 'AuthSuiteWrapper',
'props', 'videoConfig', {dict}, any, {require('video config')}))
config = traverse_obj(data, (
'props', 'authSuiteConfig', {dict}, {require('auth suite config')}))
headers['X-VIA-TVE-MEDIATOKEN'] = self._get_media_token(video_config, config, display_id)
if not mgid:
data = self._parse_json(self._search_regex(
r'__DATA__\s*=\s*({.+?});', webpage, 'data'), None)
main_container = self._extract_child_with_type(data, 'MainContainer')
ab_testing = self._extract_child_with_type(main_container, 'ABTesting')
video_player = self._extract_child_with_type(ab_testing or main_container, 'VideoPlayer')
if video_player:
mgid = try_get(video_player, lambda x: x['props']['media']['video']['config']['uri'])
else:
flex_wrapper = self._extract_child_with_type(ab_testing or main_container, 'FlexWrapper')
auth_suite_wrapper = self._extract_child_with_type(flex_wrapper, 'AuthSuiteWrapper')
player = self._extract_child_with_type(auth_suite_wrapper or flex_wrapper, 'Player')
if player:
mgid = try_get(player, lambda x: x['props']['videoDetail']['mgid'])
stream_info = self._download_json(
service_url, video_id, 'Downloading API JSON', 'Unable to download API JSON',
query={'clientPlatform': 'desktop'}, headers=headers)['stitchedstream']
if not mgid:
raise ExtractorError('Could not extract mgid')
manifest_type = stream_info['manifesttype']
if manifest_type == 'hls':
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
stream_info['source'], video_id, 'mp4', m3u8_id=manifest_type)
elif manifest_type == 'dash':
formats, subtitles = self._extract_mpd_formats_and_subtitles(
stream_info['source'], video_id, mpd_id=manifest_type)
else:
self.raise_no_formats(f'Unsupported manifest type "{manifest_type}"')
formats, subtitles = [], {}
return mgid
def _real_extract(self, url):
title = url_basename(url)
webpage = self._download_webpage(url, title)
mgid = self._extract_mgid(webpage)
return self._get_videos_info(mgid, url=url)
return {
**traverse_obj(video_detail, {
'title': ('title', {str}),
'channel': ('channel', 'name', {str}),
'thumbnails': ('images', ..., {'url': ('url', {url_or_none})}),
'description': (('fullDescription', 'description'), {str}, any),
'series': ('parentEntity', 'title', {str}),
'season_number': ('seasonNumber', {int_or_none}),
'episode_number': ('episodeAiringOrder', {int_or_none}),
'duration': ('duration', 'milliseconds', {float_or_none(scale=1000)}),
'timestamp': ((
('originalPublishDate', {parse_iso8601}),
('publishDate', 'timestamp', {int_or_none})), any),
'release_timestamp': ((
('originalAirDate', {parse_iso8601}),
('airDate', 'timestamp', {int_or_none})), any),
}),
'id': video_id,
'display_id': display_id,
'formats': formats,
'subtitles': subtitles,
}
class MTVServicesEmbeddedIE(MTVServicesInfoExtractor):
IE_NAME = 'mtvservices:embedded'
_VALID_URL = r'https?://media\.mtvnservices\.com/embed/(?P<mgid>.+?)(\?|/|$)'
_EMBED_REGEX = [r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//media\.mtvnservices\.com/embed/.+?)\1']
_TEST = {
# From http://www.thewrap.com/peter-dinklage-sums-up-game-of-thrones-in-45-seconds-video/
'url': 'http://media.mtvnservices.com/embed/mgid:uma:video:mtv.com:1043906/cp~vid%3D1043906%26uri%3Dmgid%3Auma%3Avideo%3Amtv.com%3A1043906',
'md5': 'cb349b21a7897164cede95bd7bf3fbb9',
'info_dict': {
'id': '1043906',
'ext': 'mp4',
'title': 'Peter Dinklage Sums Up \'Game Of Thrones\' In 45 Seconds',
'description': '"Sexy sexy sexy, stabby stabby stabby, beautiful language," says Peter Dinklage as he tries summarizing "Game of Thrones" in under a minute.',
'timestamp': 1400126400,
'upload_date': '20140515',
},
}
def _get_feed_url(self, uri, url=None):
video_id = self._id_from_uri(uri)
config = self._download_json(
f'http://media.mtvnservices.com/pmt/e1/access/index.html?uri={uri}&configtype=edge', video_id)
return self._remove_template_parameter(config['feedWithQueryParams'])
def _real_extract(self, url):
mobj = self._match_valid_url(url)
mgid = mobj.group('mgid')
return self._get_videos_info(mgid)
class MTVIE(MTVServicesInfoExtractor):
class MTVIE(MTVServicesBaseIE):
IE_NAME = 'mtv'
_VALID_URL = r'https?://(?:www\.)?mtv\.com/(?:video-clips|(?:full-)?episodes)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://www.mtv.com/feeds/mrss/'
_VALID_URL = r'https?://(?:www\.)?mtv\.com/(?:video-clips|episodes)/(?P<id>[\da-z]{6})'
_TESTS = [{
'url': 'http://www.mtv.com/video-clips/vl8qof/unlocking-the-truth-trailer',
'md5': '1edbcdf1e7628e414a8c5dcebca3d32b',
'url': 'https://www.mtv.com/video-clips/syolsj',
'info_dict': {
'id': '5e14040d-18a4-47c4-a582-43ff602de88e',
'id': '213ea7f8-bac7-4a43-8cd5-8d8cb8c8160f',
'ext': 'mp4',
'title': 'Unlocking The Truth|July 18, 2016|1|101|Trailer',
'description': '"Unlocking the Truth" premieres August 17th at 11/10c.',
'timestamp': 1468846800,
'upload_date': '20160718',
'display_id': 'syolsj',
'title': 'The Challenge: Vets & New Threats',
'description': 'md5:c4d2e90a5fff6463740fbf96b2bb6a41',
'duration': 95.0,
'thumbnail': r're:https://images\.paramount\.tech/uri/mgid:arc:imageassetref',
'series': 'The Challenge',
'season': 'Season 41',
'season_number': 41,
'episode': 'Episode 0',
'episode_number': 0,
'timestamp': 1753945200,
'upload_date': '20250731',
'release_timestamp': 1753945200,
'release_date': '20250731',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://www.mtv.com/full-episodes/94tujl/unlocking-the-truth-gates-of-hell-season-1-ep-101',
'only_matching': True,
}, {
'url': 'http://www.mtv.com/episodes/g8xu7q/teen-mom-2-breaking-the-wall-season-7-ep-713',
'only_matching': True,
'url': 'https://www.mtv.com/episodes/uzvigh',
'info_dict': {
'id': '364e8b9e-e415-11ef-b405-16fff45bc035',
'ext': 'mp4',
'display_id': 'uzvigh',
'title': 'CT Tamburello and Johnny Bananas',
'description': 'md5:364cea52001e9c13f92784e3365c6606',
'channel': 'MTV',
'duration': 1260.0,
'thumbnail': r're:https://images\.paramount\.tech/uri/mgid:arc:imageassetref',
'series': 'Ridiculousness',
'season': 'Season 47',
'season_number': 47,
'episode': 'Episode 19',
'episode_number': 19,
'timestamp': 1753318800,
'upload_date': '20250724',
'release_timestamp': 1753318800,
'release_date': '20250724',
},
'params': {'skip_download': 'm3u8'},
}]
class MTVJapanIE(MTVServicesInfoExtractor):
IE_NAME = 'mtvjapan'
_VALID_URL = r'https?://(?:www\.)?mtvjapan\.com/videos/(?P<id>[0-9a-z]+)'
_TEST = {
'url': 'http://www.mtvjapan.com/videos/prayht/fresh-info-cadillac-escalade',
'info_dict': {
'id': 'bc01da03-6fe5-4284-8880-f291f4e368f5',
'ext': 'mp4',
'title': '【Fresh Info】Cadillac ESCALADE Sport Edition',
},
'params': {
'skip_download': True,
},
}
_GEO_COUNTRIES = ['JP']
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'
def _get_feed_query(self, uri):
return {
'arcEp': 'mtvjapan.com',
'mgid': uri,
}
class MTVVideoIE(MTVServicesInfoExtractor):
IE_NAME = 'mtv:video'
_VALID_URL = r'''(?x)^https?://
(?:(?:www\.)?mtv\.com/videos/.+?/(?P<videoid>[0-9]+)/[^/]+$|
m\.mtv\.com/videos/video\.rbml\?.*?id=(?P<mgid>[^&]+))'''
_FEED_URL = 'http://www.mtv.com/player/embed/AS3/rss/'
_TESTS = [
{
'url': 'http://www.mtv.com/videos/misc/853555/ours-vh1-storytellers.jhtml',
'md5': '850f3f143316b1e71fa56a4edfd6e0f8',
'info_dict': {
'id': '853555',
'ext': 'mp4',
'title': 'Taylor Swift - "Ours (VH1 Storytellers)"',
'description': 'Album: Taylor Swift performs "Ours" for VH1 Storytellers at Harvey Mudd College.',
'timestamp': 1352610000,
'upload_date': '20121111',
},
},
]
def _get_thumbnail_url(self, uri, itemdoc):
return 'http://mtv.mtvnimages.com/uri/' + uri
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('videoid')
uri = mobj.groupdict().get('mgid')
if uri is None:
webpage = self._download_webpage(url, video_id)
# Some videos come from Vevo.com
m_vevo = re.search(
r'(?s)isVevoVideo = true;.*?vevoVideoId = "(.*?)";', webpage)
if m_vevo:
vevo_id = m_vevo.group(1)
self.to_screen(f'Vevo video detected: {vevo_id}')
return self.url_result(f'vevo:{vevo_id}', ie='Vevo')
uri = self._html_search_regex(r'/uri/(.*?)\?', webpage, 'uri')
return self._get_videos_info(uri)
class MTVDEIE(MTVServicesInfoExtractor):
_WORKING = False
IE_NAME = 'mtv.de'
_VALID_URL = r'https?://(?:www\.)?mtv\.de/(?:musik/videoclips|folgen|news)/(?P<id>[0-9a-z]+)'
_TESTS = [{
'url': 'http://www.mtv.de/musik/videoclips/2gpnv7/Traum',
'info_dict': {
'id': 'd5d472bc-f5b7-11e5-bffd-a4badb20dab5',
'ext': 'mp4',
'title': 'Traum',
'description': 'Traum',
},
'params': {
# rtmp download
'skip_download': True,
},
'skip': 'Blocked at Travis CI',
}, {
# mediagen URL without query (e.g. http://videos.mtvnn.com/mediagen/e865da714c166d18d6f80893195fcb97)
'url': 'http://www.mtv.de/folgen/6b1ylu/teen-mom-2-enthuellungen-S5-F1',
'info_dict': {
'id': '1e5a878b-31c5-11e7-a442-0e40cf2fc285',
'ext': 'mp4',
'title': 'Teen Mom 2',
'description': 'md5:dc65e357ef7e1085ed53e9e9d83146a7',
},
'params': {
# rtmp download
'skip_download': True,
},
'skip': 'Blocked at Travis CI',
}, {
'url': 'http://www.mtv.de/news/glolix/77491-mtv-movies-spotlight--pixels--teil-3',
'info_dict': {
'id': 'local_playlist-4e760566473c4c8c5344',
'ext': 'mp4',
'title': 'Article_mtv-movies-spotlight-pixels-teil-3_short-clips_part1',
'description': 'MTV Movies Supercut',
},
'params': {
# rtmp download
'skip_download': True,
},
'skip': 'Das Video kann zur Zeit nicht abgespielt werden.',
}]
_GEO_COUNTRIES = ['DE']
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'
def _get_feed_query(self, uri):
return {
'arcEp': 'mtv.de',
'mgid': uri,
}
class MTVItaliaIE(MTVServicesInfoExtractor):
IE_NAME = 'mtv.it'
_VALID_URL = r'https?://(?:www\.)?mtv\.it/(?:episodi|video|musica)/(?P<id>[0-9a-z]+)'
_TESTS = [{
'url': 'http://www.mtv.it/episodi/24bqab/mario-una-serie-di-maccio-capatonda-cavoli-amario-episodio-completo-S1-E1',
'info_dict': {
'id': '0f0fc78e-45fc-4cce-8f24-971c25477530',
'ext': 'mp4',
'title': 'Cavoli amario (episodio completo)',
'description': 'md5:4962bccea8fed5b7c03b295ae1340660',
'series': 'Mario - Una Serie Di Maccio Capatonda',
'season_number': 1,
'episode_number': 1,
},
'params': {
'skip_download': True,
},
}]
_GEO_COUNTRIES = ['IT']
_FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'
def _get_feed_query(self, uri):
return {
'arcEp': 'mtv.it',
'mgid': uri,
}
class MTVItaliaProgrammaIE(MTVItaliaIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'mtv.it:programma'
_VALID_URL = r'https?://(?:www\.)?mtv\.it/(?:programmi|playlist)/(?P<id>[0-9a-z]+)'
_TESTS = [{
# program page: general
'url': 'http://www.mtv.it/programmi/s2rppv/mario-una-serie-di-maccio-capatonda',
'info_dict': {
'id': 'a6f155bc-8220-4640-aa43-9b95f64ffa3d',
'title': 'Mario - Una Serie Di Maccio Capatonda',
'description': 'md5:72fbffe1f77ccf4e90757dd4e3216153',
},
'playlist_count': 2,
'params': {
'skip_download': True,
},
}, {
# program page: specific season
'url': 'http://www.mtv.it/programmi/d9ncjf/mario-una-serie-di-maccio-capatonda-S2',
'info_dict': {
'id': '4deeb5d8-f272-490c-bde2-ff8d261c6dd1',
'title': 'Mario - Una Serie Di Maccio Capatonda - Stagione 2',
},
'playlist_count': 34,
'params': {
'skip_download': True,
},
}, {
# playlist page + redirect
'url': 'http://www.mtv.it/playlist/sexy-videos/ilctal',
'info_dict': {
'id': 'dee8f9ee-756d-493b-bf37-16d1d2783359',
'title': 'Sexy Videos',
},
'playlist_mincount': 145,
'params': {
'skip_download': True,
},
}]
_GEO_COUNTRIES = ['IT']
_FEED_URL = 'http://www.mtv.it/feeds/triforce/manifest/v8'
def _get_entries(self, title, url):
while True:
pg = self._search_regex(r'/(\d+)$', url, 'entries', '1')
entries = self._download_json(url, title, f'page {pg}')
url = try_get(
entries, lambda x: x['result']['nextPageURL'], str)
entries = try_get(
entries, (
lambda x: x['result']['data']['items'],
lambda x: x['result']['data']['seasons']),
list)
for entry in entries or []:
if entry.get('canonicalURL'):
yield self.url_result(entry['canonicalURL'])
if not url:
break
def _real_extract(self, url):
query = {'url': url}
info_url = update_url_query(self._FEED_URL, query)
video_id = self._match_id(url)
info = self._download_json(info_url, video_id).get('manifest')
redirect = try_get(
info, lambda x: x['newLocation']['url'], str)
if redirect:
return self.url_result(redirect)
title = info.get('title')
video_id = try_get(
info, lambda x: x['reporting']['itemId'], str)
parent_id = try_get(
info, lambda x: x['reporting']['parentId'], str)
playlist_url = current_url = None
for z in (info.get('zones') or {}).values():
if z.get('moduleName') in ('INTL_M304', 'INTL_M209'):
info_url = z.get('feed')
if z.get('moduleName') in ('INTL_M308', 'INTL_M317'):
playlist_url = playlist_url or z.get('feed')
if z.get('moduleName') in ('INTL_M300',):
current_url = current_url or z.get('feed')
if not info_url:
raise ExtractorError('No info found')
if video_id == parent_id:
video_id = self._search_regex(
r'([^\/]+)/[^\/]+$', info_url, 'video_id')
info = self._download_json(info_url, video_id, 'Show infos')
info = try_get(info, lambda x: x['result']['data'], dict)
title = title or try_get(
info, (
lambda x: x['title'],
lambda x: x['headline']),
str)
description = try_get(info, lambda x: x['content'], str)
if current_url:
season = try_get(
self._download_json(playlist_url, video_id, 'Seasons info'),
lambda x: x['result']['data'], dict)
current = try_get(
season, lambda x: x['currentSeason'], str)
seasons = try_get(
season, lambda x: x['seasons'], list) or []
if current in [s.get('eTitle') for s in seasons]:
playlist_url = current_url
title = re.sub(
r'[-|]\s*(?:mtv\s*italia|programma|playlist)',
'', title, flags=re.IGNORECASE).strip()
return self.playlist_result(
self._get_entries(title, playlist_url),
video_id, title, description)

View File

@@ -111,12 +111,8 @@ class MySpaceIE(InfoExtractor):
search_data('stream-url'), search_data('hls-stream-url'),
search_data('http-stream-url'))
if not formats:
vevo_id = search_data('vevo-id')
youtube_id = search_data('youtube-id')
if vevo_id:
self.to_screen(f'Vevo video detected: {vevo_id}')
return self.url_result(f'vevo:{vevo_id}', ie='Vevo')
elif youtube_id:
if youtube_id:
self.to_screen(f'Youtube video detected: {youtube_id}')
return self.url_result(youtube_id, ie='Youtube')
else:

View File

@@ -1,4 +1,5 @@
import re
import urllib.parse
from .common import InfoExtractor
from ..utils import (
@@ -38,7 +39,7 @@ class N1InfoIIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:\w+\.)?n1info\.\w+|nova\.rs)/(?:[^/?#]+/){1,2}(?P<id>[^/?#]+)'
_TESTS = [{
# YouTube embedded
'url': 'https://rs.n1info.com/sport-klub/tenis/kako-je-djokovic-propustio-istorijsku-priliku-video/',
'url': 'https://sportklub.n1info.rs/tenis/us-open/glava-telo-igra-kako-je-novak-ispustio-istorijsku-sansu/',
'md5': '987ce6fd72acfecc453281e066b87973',
'info_dict': {
'id': 'L5Hd4hQVUpk',
@@ -67,36 +68,24 @@ class N1InfoIIE(InfoExtractor):
'playable_in_embed': True,
'availability': 'public',
'live_status': 'not_live',
'media_type': 'video',
},
}, {
'url': 'https://rs.n1info.com/vesti/djilas-los-plan-za-metro-nece-resiti-nijedan-saobracajni-problem/',
'url': 'https://n1info.si/novice/svet/v-srbiji-samo-ta-konec-tedna-vec-kot-200-pozarov/',
'info_dict': {
'id': 'bgmetrosot2409zta20210924174316682-n1info-rs-worldwide',
'id': '2182656',
'ext': 'mp4',
'title': 'Đilas: Predlog izgradnje metroa besmislen; SNS odbacuje navode',
'upload_date': '20210924',
'timestamp': 1632481347,
'thumbnail': 'http://n1info.rs/wp-content/themes/ucnewsportal-n1/dist/assets/images/placeholder-image-video.jpg',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://n1info.si/novice/slovenija/zadnji-dnevi-na-kopaliscu-ilirija-ilirija-ni-umrla-ubili-so-jo/',
'info_dict': {
'id': 'ljsottomazilirija3060921-n1info-si-worldwide',
'ext': 'mp4',
'title': 'Zadnji dnevi na kopališču Ilirija: “Ilirija ni umrla, ubili so jo”',
'timestamp': 1632567630,
'upload_date': '20210925',
'thumbnail': 'https://n1info.si/wp-content/uploads/2021/09/06/1630945843-tomaz3.png',
'title': 'V Srbiji samo ta konec tedna več kot 200 požarov',
'timestamp': 1753611983,
'upload_date': '20250727',
'thumbnail': 'https://n1info.si/media/images/2025/7/1753611048_Pozar.width-1200.webp',
},
'params': {
'skip_download': True,
},
}, {
# Reddit embedded
'url': 'https://ba.n1info.com/lifestyle/vucic-bolji-od-tita-ako-izgubi-ja-cu-da-crknem-jugoslavija-je-gotova/',
'url': 'https://nova.rs/vesti/drustvo/ako-vucic-izgubi-izbore-ja-cu-da-crknem-jugoslavija-je-gotova/',
'info_dict': {
'id': '2wmfee9eycp71',
'ext': 'mp4',
@@ -113,9 +102,6 @@ class N1InfoIIE(InfoExtractor):
'duration': 134,
'thumbnail': 'https://external-preview.redd.it/5nmmawSeGx60miQM3Iq-ueC9oyCLTLjjqX-qqY8uRsc.png?format=pjpg&auto=webp&s=2f973400b04d23f871b608b178e47fc01f9b8f1d',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://nova.rs/vesti/politika/zaklina-tatalovic-ani-brnabic-pricate-lazi-video/',
'info_dict': {
@@ -126,6 +112,9 @@ class N1InfoIIE(InfoExtractor):
'timestamp': 1635861677,
'thumbnail': 'https://nova.rs/wp-content/uploads/2021/11/02/1635860298-TNJG_Ana_Brnabic_i_Zaklina_Tatalovic_100_dana_Vlade_GP.jpg',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://n1info.rs/vesti/cuta-biti-u-kosovskoj-mitrovici-znaci-da-te-docekaju-eksplozivnim-napravama/',
'info_dict': {
@@ -155,12 +144,17 @@ class N1InfoIIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<h1[^>]+>(.+?)</h1>', webpage, 'title')
timestamp = unified_timestamp(self._html_search_meta('article:published_time', webpage))
title = self._og_search_title(webpage) or self._html_extract_title(webpage)
timestamp = unified_timestamp(
self._og_search_property('published_time', webpage, default=None)
or self._html_search_meta('article:published_time', webpage))
plugin_data = re.findall(r'\$bp\("(?:Brid|TargetVideo)_\d+",\s(.+)\);', webpage)
entries = []
if plugin_data:
site_id = self._html_search_regex(r'site:(\d+)', webpage, 'site id')
site_id = self._html_search_regex(r'site:(\d+)', webpage, 'site id', default=None)
if site_id is None:
site_id = self._search_regex(
r'partners/(\d+)', self._html_search_meta('contentUrl', webpage, fatal=True), 'site ID')
for video_data in plugin_data:
video_id = self._parse_json(video_data, title)['video']
entries.append({
@@ -191,10 +185,13 @@ class N1InfoIIE(InfoExtractor):
for embedded_video in embedded_videos:
video_data = extract_attributes(embedded_video)
url = video_data.get('src') or ''
if url.startswith('https://www.youtube.com'):
hostname = urllib.parse.urlparse(url).hostname
if hostname == 'www.youtube.com':
entries.append(self.url_result(url, ie='Youtube'))
elif url.startswith('https://www.redditmedia.com'):
elif hostname == 'www.redditmedia.com':
entries.append(self.url_result(url, ie='Reddit'))
elif hostname == 'www.facebook.com' and 'plugins/video' in url:
entries.append(self.url_result(url, ie='FacebookPluginsVideo'))
return {
'_type': 'playlist',

View File

@@ -138,95 +138,88 @@ class NBCUniversalBaseIE(ThePlatformBaseIE):
class NBCIE(NBCUniversalBaseIE):
_VALID_URL = r'https?(?P<permalink>://(?:www\.)?nbc\.com/(?:classic-tv/)?[^/?#]+/video/[^/?#]+/(?P<id>\w+))'
_TESTS = [
{
'url': 'http://www.nbc.com/the-tonight-show/video/jimmy-fallon-surprises-fans-at-ben-jerrys/2848237',
'info_dict': {
'id': '2848237',
'ext': 'mp4',
'title': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
'description': 'Jimmy gives out free scoops of his new "Tonight Dough" ice cream flavor by surprising customers at the Ben & Jerry\'s scoop shop.',
'timestamp': 1424246400,
'upload_date': '20150218',
'uploader': 'NBCU-COM',
'episode': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
'episode_number': 86,
'season': 'Season 2',
'season_number': 2,
'series': 'Tonight',
'duration': 236.504,
'tags': 'count:2',
'thumbnail': r're:https?://.+\.jpg',
'categories': ['Series/The Tonight Show Starring Jimmy Fallon'],
'media_type': 'Full Episode',
'age_limit': 14,
'_old_archive_ids': ['theplatform 2848237'],
},
'params': {
'skip_download': 'm3u8',
},
_TESTS = [{
'url': 'http://www.nbc.com/the-tonight-show/video/jimmy-fallon-surprises-fans-at-ben-jerrys/2848237',
'info_dict': {
'id': '2848237',
'ext': 'mp4',
'title': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
'description': 'Jimmy gives out free scoops of his new "Tonight Dough" ice cream flavor by surprising customers at the Ben & Jerry\'s scoop shop.',
'timestamp': 1424246400,
'upload_date': '20150218',
'uploader': 'NBCU-COM',
'episode': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
'episode_number': 86,
'season': 'Season 2',
'season_number': 2,
'series': 'Tonight',
'duration': 236.504,
'tags': 'count:2',
'thumbnail': r're:https?://.+\.jpg',
'categories': ['Series/The Tonight Show Starring Jimmy Fallon'],
'media_type': 'Full Episode',
'age_limit': 14,
'_old_archive_ids': ['theplatform 2848237'],
},
{
'url': 'https://www.nbc.com/the-golden-globe-awards/video/oprah-winfrey-receives-cecil-b-de-mille-award-at-the-2018-golden-globes/3646439',
'info_dict': {
'id': '3646439',
'ext': 'mp4',
'title': 'Oprah Winfrey Receives Cecil B. de Mille Award at the 2018 Golden Globes',
'episode': 'Oprah Winfrey Receives Cecil B. de Mille Award at the 2018 Golden Globes',
'episode_number': 1,
'season': 'Season 75',
'season_number': 75,
'series': 'Golden Globes',
'description': 'Oprah Winfrey receives the Cecil B. de Mille Award at the 75th Annual Golden Globe Awards.',
'uploader': 'NBCU-COM',
'upload_date': '20180107',
'timestamp': 1515312000,
'duration': 569.703,
'tags': 'count:8',
'thumbnail': r're:https?://.+\.jpg',
'media_type': 'Highlight',
'age_limit': 0,
'categories': ['Series/The Golden Globe Awards'],
'_old_archive_ids': ['theplatform 3646439'],
},
'params': {
'skip_download': 'm3u8',
},
'params': {
'skip_download': 'm3u8',
},
{
# Needs to be extracted from webpage instead of GraphQL
'url': 'https://www.nbc.com/paris2024/video/ali-truwit-found-purpose-pool-after-her-life-changed/para24_sww_alitruwittodayshow_240823',
'info_dict': {
'id': 'para24_sww_alitruwittodayshow_240823',
'ext': 'mp4',
'title': 'Ali Truwit found purpose in the pool after her life changed',
'description': 'md5:c16d7489e1516593de1cc5d3f39b9bdb',
'uploader': 'NBCU-SPORTS',
'duration': 311.077,
'thumbnail': r're:https?://.+\.jpg',
'episode': 'Ali Truwit found purpose in the pool after her life changed',
'timestamp': 1724435902.0,
'upload_date': '20240823',
'_old_archive_ids': ['theplatform para24_sww_alitruwittodayshow_240823'],
},
'params': {
'skip_download': 'm3u8',
},
}, {
'url': 'https://www.nbc.com/the-golden-globe-awards/video/oprah-winfrey-receives-cecil-b-de-mille-award-at-the-2018-golden-globes/3646439',
'info_dict': {
'id': '3646439',
'ext': 'mp4',
'title': 'Oprah Winfrey Receives Cecil B. de Mille Award at the 2018 Golden Globes',
'episode': 'Oprah Winfrey Receives Cecil B. de Mille Award at the 2018 Golden Globes',
'episode_number': 1,
'season': 'Season 75',
'season_number': 75,
'series': 'Golden Globes',
'description': 'Oprah Winfrey receives the Cecil B. de Mille Award at the 75th Annual Golden Globe Awards.',
'uploader': 'NBCU-COM',
'upload_date': '20180107',
'timestamp': 1515312000,
'duration': 569.703,
'tags': 'count:8',
'thumbnail': r're:https?://.+\.jpg',
'media_type': 'Highlight',
'age_limit': 0,
'categories': ['Series/The Golden Globe Awards'],
'_old_archive_ids': ['theplatform 3646439'],
},
{
'url': 'https://www.nbc.com/quantum-leap/video/bens-first-leap-nbcs-quantum-leap/NBCE125189978',
'only_matching': True,
'params': {
'skip_download': 'm3u8',
},
{
'url': 'https://www.nbc.com/classic-tv/charles-in-charge/video/charles-in-charge-pilot/n3310',
'only_matching': True,
}, {
# Needs to be extracted from webpage instead of GraphQL
'url': 'https://www.nbc.com/paris2024/video/ali-truwit-found-purpose-pool-after-her-life-changed/para24_sww_alitruwittodayshow_240823',
'info_dict': {
'id': 'para24_sww_alitruwittodayshow_240823',
'ext': 'mp4',
'title': 'Ali Truwit found purpose in the pool after her life changed',
'description': 'md5:c16d7489e1516593de1cc5d3f39b9bdb',
'uploader': 'NBCU-SPORTS',
'duration': 311.077,
'thumbnail': r're:https?://.+\.jpg',
'episode': 'Ali Truwit found purpose in the pool after her life changed',
'timestamp': 1724435902.0,
'upload_date': '20240823',
'_old_archive_ids': ['theplatform para24_sww_alitruwittodayshow_240823'],
},
{
# Percent escaped url
'url': 'https://www.nbc.com/up-all-night/video/day-after-valentine%27s-day/n2189',
'only_matching': True,
'params': {
'skip_download': 'm3u8',
},
]
}, {
'url': 'https://www.nbc.com/quantum-leap/video/bens-first-leap-nbcs-quantum-leap/NBCE125189978',
'only_matching': True,
}, {
'url': 'https://www.nbc.com/classic-tv/charles-in-charge/video/charles-in-charge-pilot/n3310',
'only_matching': True,
}, {
# Percent escaped url
'url': 'https://www.nbc.com/up-all-night/video/day-after-valentine%27s-day/n2189',
'only_matching': True,
}]
_SOFTWARE_STATEMENT = 'eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiI1Yzg2YjdkYy04NDI3LTRjNDUtOGQwZi1iNDkzYmE3MmQwYjQiLCJuYmYiOjE1Nzg3MDM2MzEsImlzcyI6ImF1dGguYWRvYmUuY29tIiwiaWF0IjoxNTc4NzAzNjMxfQ.QQKIsBhAjGQTMdAqRTqhcz2Cddr4Y2hEjnSiOeKKki4nLrkDOsjQMmqeTR0hSRarraxH54wBgLvsxI7LHwKMvr7G8QpynNAxylHlQD3yhN9tFhxt4KR5wW3as02B-W2TznK9bhNWPKIyHND95Uo2Mi6rEQoq8tM9O09WPWaanE5BX_-r6Llr6dPq5F0Lpx2QOn2xYRb1T4nFxdFTNoss8GBds8OvChTiKpXMLHegLTc1OS4H_1a8tO_37jDwSdJuZ8iTyRLV4kZ2cpL6OL5JPMObD4-HQiec_dfcYgMKPiIfP9ZqdXpec2SVaCLsWEk86ZYvD97hLIQrK5rrKd1y-A'
def _real_extract(self, url):
@@ -378,6 +371,15 @@ class NBCSportsIE(InfoExtractor):
'url': 'https://www.nbcsports.com/boston/video/report-card-pats-secondary-no-match-josh-allen',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'http://www.riderfans.com/forum/showthread.php?121827-Freeman&s=e98fa1ea6dc08e886b1678d35212494a',
'info_dict': {
'id': 'ln7x1qSThw4k',
'ext': 'flv',
'title': "PFT Live: New leader in the 'new-look' defense",
},
'skip': 'Invalid URL',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -389,7 +391,7 @@ class NBCSportsIE(InfoExtractor):
class NBCSportsStreamIE(AdobePassIE):
_WORKING = False
_VALID_URL = r'https?://stream\.nbcsports\.com/.+?\bpid=(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'http://stream.nbcsports.com/nbcsn/generic?pid=206559',
'info_dict': {
'id': '206559',
@@ -402,7 +404,7 @@ class NBCSportsStreamIE(AdobePassIE):
'skip_download': True,
},
'skip': 'Requires Adobe Pass Authentication',
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -449,98 +451,100 @@ class NBCNewsIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'(?x)https?://(?:www\.)?(?:nbcnews|today|msnbc)\.com/([^/]+/)*(?:.*-)?(?P<id>[^/?]+)'
_EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//www\.nbcnews\.com/widget/video-embed/[^"\']+)\1']
_TESTS = [
{
'url': 'http://www.nbcnews.com/watch/nbcnews-com/how-twitter-reacted-to-the-snowden-interview-269389891880',
'md5': 'fb3dcd2d7b1dd9804305fa2fc95ab610', # md5 tends to fluctuate
'info_dict': {
'id': '269389891880',
'ext': 'mp4',
'title': 'How Twitter Reacted To The Snowden Interview',
'description': 'md5:65a0bd5d76fe114f3c2727aa3a81fe64',
'timestamp': 1401363060,
'upload_date': '20140529',
'duration': 46.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/140529/p_tweet_snow_140529.jpg',
},
_TESTS = [{
'url': 'http://www.nbcnews.com/watch/nbcnews-com/how-twitter-reacted-to-the-snowden-interview-269389891880',
'md5': 'fb3dcd2d7b1dd9804305fa2fc95ab610', # md5 tends to fluctuate
'info_dict': {
'id': '269389891880',
'ext': 'mp4',
'title': 'How Twitter Reacted To The Snowden Interview',
'description': 'md5:65a0bd5d76fe114f3c2727aa3a81fe64',
'timestamp': 1401363060,
'upload_date': '20140529',
'duration': 46.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/140529/p_tweet_snow_140529.jpg',
},
{
'url': 'http://www.nbcnews.com/feature/dateline-full-episodes/full-episode-family-business-n285156',
'md5': 'fdbf39ab73a72df5896b6234ff98518a',
'info_dict': {
'id': '529953347624',
'ext': 'mp4',
'title': 'FULL EPISODE: Family Business',
'description': 'md5:757988edbaae9d7be1d585eb5d55cc04',
},
'skip': 'This page is unavailable.',
}, {
'url': 'http://www.nbcnews.com/feature/dateline-full-episodes/full-episode-family-business-n285156',
'md5': 'fdbf39ab73a72df5896b6234ff98518a',
'info_dict': {
'id': '529953347624',
'ext': 'mp4',
'title': 'FULL EPISODE: Family Business',
'description': 'md5:757988edbaae9d7be1d585eb5d55cc04',
},
{
'url': 'http://www.nbcnews.com/nightly-news/video/nightly-news-with-brian-williams-full-broadcast-february-4-394064451844',
'md5': '40d0e48c68896359c80372306ece0fc3',
'info_dict': {
'id': '394064451844',
'ext': 'mp4',
'title': 'Nightly News with Brian Williams Full Broadcast (February 4)',
'description': 'md5:1c10c1eccbe84a26e5debb4381e2d3c5',
'timestamp': 1423104900,
'upload_date': '20150205',
'duration': 1236.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/__NEW/nn_netcast_150204.jpg',
},
'skip': 'This page is unavailable.',
}, {
'url': 'http://www.nbcnews.com/nightly-news/video/nightly-news-with-brian-williams-full-broadcast-february-4-394064451844',
'md5': '40d0e48c68896359c80372306ece0fc3',
'info_dict': {
'id': '394064451844',
'ext': 'mp4',
'title': 'Nightly News with Brian Williams Full Broadcast (February 4)',
'description': 'md5:1c10c1eccbe84a26e5debb4381e2d3c5',
'timestamp': 1423104900,
'upload_date': '20150205',
'duration': 1236.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/__NEW/nn_netcast_150204.jpg',
},
{
'url': 'http://www.nbcnews.com/business/autos/volkswagen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456',
'md5': 'ffb59bcf0733dc3c7f0ace907f5e3939',
'info_dict': {
'id': 'n431456',
'ext': 'mp4',
'title': "Volkswagen U.S. Chief: We 'Totally Screwed Up'",
'description': 'md5:d22d1281a24f22ea0880741bb4dd6301',
'upload_date': '20150922',
'timestamp': 1442917800,
'duration': 37.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/__NEW/x_lon_vwhorn_150922.jpg',
},
}, {
'url': 'http://www.nbcnews.com/business/autos/volkswagen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456',
'md5': 'ffb59bcf0733dc3c7f0ace907f5e3939',
'info_dict': {
'id': 'n431456',
'ext': 'mp4',
'title': "Volkswagen U.S. Chief: We 'Totally Screwed Up'",
'description': 'md5:d22d1281a24f22ea0880741bb4dd6301',
'upload_date': '20150922',
'timestamp': 1442917800,
'duration': 37.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/__NEW/x_lon_vwhorn_150922.jpg',
},
{
'url': 'http://www.today.com/video/see-the-aurora-borealis-from-space-in-stunning-new-nasa-video-669831235788',
'md5': '693d1fa21d23afcc9b04c66b227ed9ff',
'info_dict': {
'id': '669831235788',
'ext': 'mp4',
'title': 'See the aurora borealis from space in stunning new NASA video',
'description': 'md5:74752b7358afb99939c5f8bb2d1d04b1',
'upload_date': '20160420',
'timestamp': 1461152093,
'duration': 69.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/201604/2016-04-20T11-35-09-133Z--1280x720.jpg',
},
}, {
'url': 'http://www.today.com/video/see-the-aurora-borealis-from-space-in-stunning-new-nasa-video-669831235788',
'md5': '693d1fa21d23afcc9b04c66b227ed9ff',
'info_dict': {
'id': '669831235788',
'ext': 'mp4',
'title': 'See the aurora borealis from space in stunning new NASA video',
'description': 'md5:74752b7358afb99939c5f8bb2d1d04b1',
'upload_date': '20160420',
'timestamp': 1461152093,
'duration': 69.0,
'thumbnail': 'https://media-cldnry.s-nbcnews.com/image/upload/MSNBC/Components/Video/201604/2016-04-20T11-35-09-133Z--1280x720.jpg',
},
{
'url': 'http://www.msnbc.com/all-in-with-chris-hayes/watch/the-chaotic-gop-immigration-vote-314487875924',
'md5': '6d236bf4f3dddc226633ce6e2c3f814d',
'info_dict': {
'id': '314487875924',
'ext': 'mp4',
'title': 'The chaotic GOP immigration vote',
'description': 'The Republican House votes on a border bill that has no chance of getting through the Senate or signed by the President and is drawing criticism from all sides.',
'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1406937606,
'upload_date': '20140802',
'duration': 940.0,
},
'skip': 'Invalid URL',
}, {
'url': 'http://www.msnbc.com/all-in-with-chris-hayes/watch/the-chaotic-gop-immigration-vote-314487875924',
'md5': '6d236bf4f3dddc226633ce6e2c3f814d',
'info_dict': {
'id': '314487875924',
'ext': 'mp4',
'title': 'The chaotic GOP immigration vote',
'description': 'The Republican House votes on a border bill that has no chance of getting through the Senate or signed by the President and is drawing criticism from all sides.',
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1406937606,
'upload_date': '20140802',
'duration': 940.0,
},
{
'url': 'http://www.nbcnews.com/watch/dateline/full-episode--deadly-betrayal-386250819952',
'only_matching': True,
'skip': 'Invalid URL',
}, {
'url': 'http://www.nbcnews.com/watch/dateline/full-episode--deadly-betrayal-386250819952',
'only_matching': True,
}, {
# From http://www.vulture.com/2016/06/letterman-couldnt-care-less-about-late-night.html
'url': 'http://www.nbcnews.com/widget/video-embed/701714499682',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'http://www.vulture.com/2016/06/letterman-couldnt-care-less-about-late-night.html',
'info_dict': {
'id': 'x_dtl_oa_LettermanliftPR_160608',
'ext': 'mp4',
'title': 'David Letterman: A Preview',
},
{
# From http://www.vulture.com/2016/06/letterman-couldnt-care-less-about-late-night.html
'url': 'http://www.nbcnews.com/widget/video-embed/701714499682',
'only_matching': True,
},
]
'skip': 'Invalid URL',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -610,10 +614,10 @@ class NBCOlympicsIE(InfoExtractor):
'display_id': 'watch-final-minutes-team-usas-mens-basketball-gold',
'title': 'Watch the final minutes of Team USA\'s men\'s basketball gold',
'description': 'md5:f704f591217305c9559b23b877aa8d31',
'episode': 'Watch the final minutes of Team USA\'s men\'s basketball gold',
'uploader': 'NBCU-SPORTS',
'duration': 387.053,
'thumbnail': r're:https://.+/.+\.jpg',
'chapters': [],
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1723346984,
'upload_date': '20240811',
},
@@ -652,33 +656,31 @@ class NBCOlympicsStreamIE(AdobePassIE):
_WORKING = False
IE_NAME = 'nbcolympics:stream'
_VALID_URL = r'https?://stream\.nbcolympics\.com/(?P<id>[0-9a-z-]+)'
_TESTS = [
{
'note': 'Tokenized m3u8 source URL',
'url': 'https://stream.nbcolympics.com/womens-soccer-group-round-11',
'info_dict': {
'id': '2019740',
'ext': 'mp4',
'title': r"re:Women's Group Stage - Netherlands vs\. Brazil [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$",
},
'params': {
'skip_download': 'm3u8',
},
'skip': 'Livestream',
}, {
'note': 'Plain m3u8 source URL',
'url': 'https://stream.nbcolympics.com/gymnastics-event-finals-mens-floor-pommel-horse-womens-vault-bars',
'info_dict': {
'id': '2021729',
'ext': 'mp4',
'title': r're:Event Finals: M Floor, W Vault, M Pommel, W Uneven Bars [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
},
'params': {
'skip_download': 'm3u8',
},
'skip': 'Livestream',
_TESTS = [{
'note': 'Tokenized m3u8 source URL',
'url': 'https://stream.nbcolympics.com/womens-soccer-group-round-11',
'info_dict': {
'id': '2019740',
'ext': 'mp4',
'title': r"re:Women's Group Stage - Netherlands vs\. Brazil [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$",
},
]
'params': {
'skip_download': 'm3u8',
},
'skip': 'Livestream',
}, {
'note': 'Plain m3u8 source URL',
'url': 'https://stream.nbcolympics.com/gymnastics-event-finals-mens-floor-pommel-horse-womens-vault-bars',
'info_dict': {
'id': '2021729',
'ext': 'mp4',
'title': r're:Event Finals: M Floor, W Vault, M Pommel, W Uneven Bars [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
},
'params': {
'skip_download': 'm3u8',
},
'skip': 'Livestream',
}]
def _real_extract(self, url):
display_id = self._match_id(url)
@@ -758,9 +760,7 @@ class NBCStationsIE(InfoExtractor):
'channel_id': 'KNBC',
'channel': 'nbclosangeles',
},
'params': {
'skip_download': 'm3u8',
},
'skip': 'Site changed',
}, {
'url': 'https://www.telemundoarizona.com/responde/huracan-complica-reembolso-para-televidente-de-tucson/2247002/',
'info_dict': {

View File

@@ -34,7 +34,6 @@ class NetEaseMusicBaseIE(InfoExtractor):
'sky', # SVIP tier; 沉浸环绕声 (Surround Audio); flac
)
_API_BASE = 'http://music.163.com/api/'
_GEO_BYPASS = False
def _create_eapi_cipher(self, api_path, query_body, cookies):
request_text = json.dumps({**query_body, 'header': cookies}, separators=(',', ':'))
@@ -64,6 +63,8 @@ class NetEaseMusicBaseIE(InfoExtractor):
'MUSIC_U': ('MUSIC_U', {lambda i: i.value}),
}),
}
if self._x_forwarded_for_ip:
headers.setdefault('X-Real-IP', self._x_forwarded_for_ip)
return self._download_json(
urljoin('https://interface3.music.163.com/', f'/eapi{path}'), video_id,
data=self._create_eapi_cipher(f'/api{path}', query_body, cookies), headers={

View File

@@ -1,224 +1,48 @@
from .mtv import MTVServicesInfoExtractor
from ..utils import update_url_query
from .mtv import MTVServicesBaseIE
class NickIE(MTVServicesInfoExtractor):
class NickIE(MTVServicesBaseIE):
IE_NAME = 'nick.com'
_VALID_URL = r'https?://(?P<domain>(?:www\.)?nick(?:jr)?\.com)/(?:[^/]+/)?(?P<type>videos/clip|[^/]+/videos|episodes/[^/]+)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://udat.mtvnservices.com/service1/dispatch.htm'
_GEO_COUNTRIES = ['US']
_VALID_URL = r'https?://(?:www\.)?nick\.com/(?:video-clips|episodes)/(?P<id>[\da-z]{6})'
_TESTS = [{
'url': 'https://www.nick.com/episodes/sq47rw/spongebob-squarepants-a-place-for-pets-lockdown-for-love-season-13-ep-1',
'url': 'https://www.nick.com/episodes/u3smw8/wylde-pak-best-summer-ever-season-1-ep-1',
'info_dict': {
'description': 'md5:0650a9eb88955609d5c1d1c79292e234',
'title': 'A Place for Pets/Lockdown for Love',
},
'playlist': [
{
'md5': 'cb8a2afeafb7ae154aca5a64815ec9d6',
'info_dict': {
'id': '85ee8177-d6ce-48f8-9eee-a65364f8a6df',
'ext': 'mp4',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S1',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
},
},
{
'md5': '839a04f49900a1fcbf517020d94e0737',
'info_dict': {
'id': '2e2a9960-8fd4-411d-868b-28eb1beb7fae',
'ext': 'mp4',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S2',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
},
},
{
'md5': 'f1145699f199770e2919ee8646955d46',
'info_dict': {
'id': 'dc91c304-6876-40f7-84a6-7aece7baa9d0',
'ext': 'mp4',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S3',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
},
},
{
'md5': 'd463116875aee2585ee58de3b12caebd',
'info_dict': {
'id': '5d929486-cf4c-42a1-889a-6e0d183a101a',
'ext': 'mp4',
'title': 'SpongeBob SquarePants: "A Place for Pets/Lockdown for Love" S4',
'description': 'A Place for Pets/Lockdown for Love: When customers bring pets into the Krusty Krab, Mr. Krabs realizes pets are more profitable than owners. Plankton ruins another date with Karen, so she puts the Chum Bucket on lockdown until he proves his affection.',
},
},
],
}, {
'url': 'http://www.nickjr.com/blues-clues-and-you/videos/blues-clues-and-you-original-209-imagination-station/',
'info_dict': {
'id': '31631529-2fc5-430b-b2ef-6a74b4609abd',
'id': 'eb9d4db0-274a-11ef-a913-0e37995d42c9',
'ext': 'mp4',
'description': 'md5:9d65a66df38e02254852794b2809d1cf',
'title': 'Blue\'s Imagination Station',
'display_id': 'u3smw8',
'title': 'Best Summer Ever?',
'description': 'md5:c737a0ade3fbc09d569c3b3d029a7792',
'channel': 'Nickelodeon',
'duration': 1296.0,
'thumbnail': r're:https://assets\.nick\.com/uri/mgid:arc:imageassetref:',
'series': 'Wylde Pak',
'season': 'Season 1',
'season_number': 1,
'episode': 'Episode 1',
'episode_number': 1,
'timestamp': 1746100800,
'upload_date': '20250501',
'release_timestamp': 1746100800,
'release_date': '20250501',
},
'skip': 'Not accessible?',
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.nick.com/video-clips/0p4706/spongebob-squarepants-spongebob-loving-the-krusty-krab-for-7-minutes',
'info_dict': {
'id': '4aac2228-5295-4076-b986-159513cf4ce4',
'ext': 'mp4',
'display_id': '0p4706',
'title': 'SpongeBob Loving the Krusty Krab for 7 Minutes!',
'description': 'md5:72bf59babdf4e6d642187502864e111d',
'duration': 423.423,
'thumbnail': r're:https://assets\.nick\.com/uri/mgid:arc:imageassetref:',
'series': 'SpongeBob SquarePants',
'season': 'Season 0',
'season_number': 0,
'episode': 'Episode 0',
'episode_number': 0,
'timestamp': 1663819200,
'upload_date': '20220922',
},
'params': {'skip_download': 'm3u8'},
}]
def _get_feed_query(self, uri):
return {
'feed': 'nick_arc_player_prime',
'mgid': uri,
}
def _real_extract(self, url):
domain, video_type, display_id = self._match_valid_url(url).groups()
if video_type.startswith('episodes'):
return super()._real_extract(url)
video_data = self._download_json(
f'http://{domain}/data/video.endLevel.json',
display_id, query={
'urlKey': display_id,
})
return self._get_videos_info(video_data['player'] + video_data['id'])
class NickBrIE(MTVServicesInfoExtractor):
IE_NAME = 'nickelodeon:br'
_VALID_URL = r'''(?x)
https?://
(?:
(?P<domain>(?:www\.)?nickjr|mundonick\.uol)\.com\.br|
(?:www\.)?nickjr\.[a-z]{2}|
(?:www\.)?nickelodeonjunior\.fr
)
/(?:programas/)?[^/]+/videos/(?:episodios/)?(?P<id>[^/?\#.]+)
'''
_TESTS = [{
'url': 'http://www.nickjr.com.br/patrulha-canina/videos/210-labirinto-de-pipoca/',
'only_matching': True,
}, {
'url': 'http://mundonick.uol.com.br/programas/the-loud-house/videos/muitas-irmas/7ljo9j',
'only_matching': True,
}, {
'url': 'http://www.nickjr.nl/paw-patrol/videos/311-ge-wol-dig-om-terug-te-zijn/',
'only_matching': True,
}, {
'url': 'http://www.nickjr.de/blaze-und-die-monster-maschinen/videos/f6caaf8f-e4e8-4cc1-b489-9380d6dcd059/',
'only_matching': True,
}, {
'url': 'http://www.nickelodeonjunior.fr/paw-patrol-la-pat-patrouille/videos/episode-401-entier-paw-patrol/',
'only_matching': True,
}]
def _real_extract(self, url):
domain, display_id = self._match_valid_url(url).groups()
webpage = self._download_webpage(url, display_id)
uri = self._search_regex(
r'data-(?:contenturi|mgid)="([^"]+)', webpage, 'mgid')
video_id = self._id_from_uri(uri)
config = self._download_json(
'http://media.mtvnservices.com/pmt/e1/access/index.html',
video_id, query={
'uri': uri,
'configtype': 'edge',
}, headers={
'Referer': url,
})
info_url = self._remove_template_parameter(config['feedWithQueryParams'])
if info_url == 'None':
if domain.startswith('www.'):
domain = domain[4:]
content_domain = {
'mundonick.uol': 'mundonick.com.br',
'nickjr': 'br.nickelodeonjunior.tv',
}[domain]
query = {
'mgid': uri,
'imageEp': content_domain,
'arcEp': content_domain,
}
if domain == 'nickjr.com.br':
query['ep'] = 'c4b16088'
info_url = update_url_query(
'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed', query)
return self._get_videos_info_from_url(info_url, video_id)
class NickDeIE(MTVServicesInfoExtractor):
IE_NAME = 'nick.de'
_VALID_URL = r'https?://(?:www\.)?(?P<host>nick\.(?:de|com\.pl|ch)|nickelodeon\.(?:nl|be|at|dk|no|se))/[^/]+/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.nick.de/playlist/3773-top-videos/videos/episode/17306-zu-wasser-und-zu-land-rauchende-erdnusse',
'only_matching': True,
}, {
'url': 'http://www.nick.de/shows/342-icarly',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.nl/shows/474-spongebob/videos/17403-een-kijkje-in-de-keuken-met-sandy-van-binnenuit',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.at/playlist/3773-top-videos/videos/episode/77993-das-letzte-gefecht',
'only_matching': True,
}, {
'url': 'http://www.nick.com.pl/seriale/474-spongebob-kanciastoporty/wideo/17412-teatr-to-jest-to-rodeo-oszolom',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.no/program/2626-bulderhuset/videoer/90947-femteklasse-veronica-vs-vanzilla',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.dk/serier/2626-hojs-hus/videoer/761-tissepause',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.se/serier/2626-lugn-i-stormen/videos/998-',
'only_matching': True,
}, {
'url': 'http://www.nick.ch/shows/2304-adventure-time-abenteuerzeit-mit-finn-und-jake',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.be/afspeellijst/4530-top-videos/videos/episode/73917-inval-broodschapper-lariekoek-arie',
'only_matching': True,
}]
def _get_feed_url(self, uri, url=None):
video_id = self._id_from_uri(uri)
config = self._download_json(
f'http://media.mtvnservices.com/pmt/e1/access/index.html?uri={uri}&configtype=edge&ref={url}', video_id)
return self._remove_template_parameter(config['feedWithQueryParams'])
class NickRuIE(MTVServicesInfoExtractor):
IE_NAME = 'nickelodeonru'
_VALID_URL = r'https?://(?:www\.)nickelodeon\.(?:ru|fr|es|pt|ro|hu|com\.tr)/[^/]+/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.nickelodeon.ru/shows/henrydanger/videos/episodes/3-sezon-15-seriya-licenziya-na-polyot/pmomfb#playlist/7airc6',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.ru/videos/smotri-na-nickelodeon-v-iyule/g9hvh7',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.fr/programmes/bob-l-eponge/videos/le-marathon-de-booh-kini-bottom-mardi-31-octobre/nfn7z0',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.es/videos/nickelodeon-consejos-tortitas/f7w7xy',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.pt/series/spongebob-squarepants/videos/a-bolha-de-tinta-gigante/xutq1b',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.ro/emisiuni/shimmer-si-shine/video/nahal-din-bomboane/uw5u2k',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.hu/musorok/spongyabob-kockanadrag/videok/episodes/buborekfujas-az-elszakadt-nadrag/q57iob#playlist/k6te4y',
'only_matching': True,
}, {
'url': 'http://www.nickelodeon.com.tr/programlar/sunger-bob/videolar/kayip-yatak/mgqbjy',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mgid = self._extract_mgid(webpage, url)
return self.url_result(f'http://media.mtvnservices.com/embed/{mgid}')

View File

@@ -3,7 +3,6 @@ import functools
import itertools
import json
import re
import time
from .common import InfoExtractor, SearchInfoExtractor
from ..networking.exceptions import HTTPError
@@ -16,12 +15,12 @@ from ..utils import (
float_or_none,
int_or_none,
parse_bitrate,
parse_duration,
parse_iso8601,
parse_qs,
parse_resolution,
qualities,
str_or_none,
time_seconds,
truncate_string,
unified_timestamp,
update_url_query,
@@ -38,8 +37,14 @@ from ..utils.traversal import (
class NiconicoBaseIE(InfoExtractor):
_API_BASE = 'https://nvapi.nicovideo.jp'
_BASE_URL = 'https://www.nicovideo.jp'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['JP']
_HEADERS = {
'X-Frontend-ID': '6',
'X-Frontend-Version': '0',
}
_LOGIN_BASE = 'https://account.nicovideo.jp'
_NETRC_MACHINE = 'niconico'
@@ -99,146 +104,266 @@ class NiconicoIE(NiconicoBaseIE):
IE_NAME = 'niconico'
IE_DESC = 'ニコニコ動画'
_VALID_URL = r'https?://(?:(?:embed|sp|www)\.)?nicovideo\.jp/watch/(?P<id>(?:[a-z]{2})?\d+)'
_ERROR_MAP = {
'FORBIDDEN': {
'ADMINISTRATOR_DELETE_VIDEO': 'Video unavailable, possibly removed by admins',
'CHANNEL_MEMBER_ONLY': 'Channel members only',
'DELETED_CHANNEL_VIDEO': 'Video unavailable, channel was closed',
'DELETED_COMMUNITY_VIDEO': 'Video unavailable, community deleted or missing',
'DEFAULT': 'Page unavailable, check the URL',
'HARMFUL_VIDEO': 'Sensitive content, login required',
'HIDDEN_VIDEO': 'Video unavailable, set to private',
'NOT_ALLOWED': 'No permission',
'PPV_VIDEO': 'PPV video, payment information required',
'PREMIUM_ONLY': 'Premium members only',
},
'INVALID_PARAMETER': {
'DEFAULT': 'Video unavailable, may not exist or was deleted',
},
'MAINTENANCE': {
'DEFAULT': 'Maintenance is in progress',
},
'NOT_FOUND': {
'DEFAULT': 'Video unavailable, may not exist or was deleted',
'RIGHT_HOLDER_DELETE_VIDEO': 'Removed by rights-holder request',
},
'UNAUTHORIZED': {
'DEFAULT': 'Invalid session, re-login required',
},
'UNKNOWN': {
'DEFAULT': 'Failed to fetch content',
},
}
_STATUS_MAP = {
'needs_auth': 'PPV video, payment information required',
'premium_only': 'Premium members only',
'subscriber_only': 'Channel members only',
}
_TESTS = [{
'url': 'http://www.nicovideo.jp/watch/sm22312215',
'url': 'https://www.nicovideo.jp/watch/1173108780',
'info_dict': {
'id': 'sm22312215',
'id': 'sm9',
'ext': 'mp4',
'title': 'Big Buck Bunny',
'thumbnail': r're:https?://.*',
'uploader': 'takuya0301',
'uploader_id': '2698420',
'upload_date': '20131123',
'timestamp': int, # timestamp is unstable
'description': '(c) copyright 2008, Blender Foundation / www.bigbuckbunny.org',
'duration': 33,
'view_count': int,
'title': '新・豪血寺一族 -煩悩解放 - レッツゴー!陰陽師',
'availability': 'public',
'channel': '中の',
'channel_id': '4',
'comment_count': int,
'description': 'md5:b7f6d3e6c29552cc19fdea6a4b7dc194',
'display_id': '1173108780',
'duration': 320,
'genres': ['未設定'],
'tags': [],
'like_count': int,
'tags': 'mincount:5',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1173108780,
'upload_date': '20070305',
'uploader': '中の',
'uploader_id': '4',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
}, {
# File downloaded with and without credentials are different, so omit
# the md5 field
'url': 'http://www.nicovideo.jp/watch/nm14296458',
'url': 'https://www.nicovideo.jp/watch/sm8628149',
'info_dict': {
'id': 'sm8628149',
'ext': 'mp4',
'title': '【東方】Bad Apple!!\u3000PV【影絵】',
'availability': 'public',
'channel': 'あにら',
'channel_id': '10731211',
'comment_count': int,
'description': 'md5:1999669158cb77a45bab123c4fafe1d7',
'display_id': 'sm8628149',
'duration': 219,
'genres': ['ゲーム'],
'like_count': int,
'tags': 'mincount:3',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1256580802,
'upload_date': '20091026',
'uploader': 'あにら',
'uploader_id': '10731211',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.nicovideo.jp/watch/nm14296458',
'info_dict': {
'id': 'nm14296458',
'ext': 'mp4',
'title': 'Kagamine Rin】Dance on media【Original】take2!',
'title': '鏡音リン】Dance on media【オリジナル】take2!',
'availability': 'public',
'channel': 'りょうた',
'channel_id': '18822557',
'comment_count': int,
'description': 'md5:9368f2b1f4178de64f2602c2f3d6cbf5',
'thumbnail': r're:https?://.*',
'display_id': 'nm14296458',
'duration': 208,
'genres': ['音楽・サウンド'],
'like_count': int,
'tags': 'mincount:1',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1304065916,
'upload_date': '20110429',
'uploader': 'りょうた',
'uploader_id': '18822557',
'upload_date': '20110429',
'timestamp': 1304065916,
'duration': 208.0,
'comment_count': int,
'view_count': int,
'genres': ['音楽・サウンド'],
'tags': ['Translation_Request', 'Kagamine_Rin', 'Rin_Original'],
},
'params': {'skip_download': 'm3u8'},
}, {
# 'video exists but is marked as "deleted"
# md5 is unstable
'url': 'http://www.nicovideo.jp/watch/sm10000',
'url': 'https://www.nicovideo.jp/watch/nl1872567',
'info_dict': {
'id': 'sm10000',
'ext': 'unknown_video',
'description': 'deleted',
'title': 'ドラえもんエターナル第3話「決戦第3新東京市」前編',
'thumbnail': r're:https?://.*',
'upload_date': '20071224',
'timestamp': int, # timestamp field has different value if logged in
'duration': 304,
'view_count': int,
},
'skip': 'Requires an account',
}, {
'url': 'http://www.nicovideo.jp/watch/so22543406',
'info_dict': {
'id': '1388129933',
'id': 'nl1872567',
'ext': 'mp4',
'title': '第1回】RADIOアニメロミックス ラブライブのぞえりRadio Garden',
'description': 'md5:b27d224bb0ff53d3c8269e9f8b561cf1',
'thumbnail': r're:https?://.*',
'timestamp': 1388851200,
'upload_date': '20140104',
'uploader': 'アニメロチャンネル',
'uploader_id': '312',
},
'skip': 'The viewing period of the video you were searching for has expired.',
}, {
# video not available via `getflv`; "old" HTML5 video
'url': 'http://www.nicovideo.jp/watch/sm1151009',
'info_dict': {
'id': 'sm1151009',
'ext': 'mp4',
'title': 'マスターシステム本体内蔵のスペハリのメインテーマ(PSG版)',
'description': 'md5:f95a3d259172667b293530cc2e41ebda',
'thumbnail': r're:https?://.*',
'duration': 184,
'timestamp': 1190835883,
'upload_date': '20070926',
'uploader': 'denden2',
'uploader_id': '1392194',
'view_count': int,
'comment_count': int,
'genres': ['ゲーム'],
'tags': [],
},
'params': {'skip_download': 'm3u8'},
}, {
# "New" HTML5 video
'url': 'http://www.nicovideo.jp/watch/sm31464864',
'info_dict': {
'id': 'sm31464864',
'ext': 'mp4',
'title': '新作TVアニメ「戦姫絶唱シンフォギアAXZ」PV 最高画質',
'description': 'md5:e52974af9a96e739196b2c1ca72b5feb',
'timestamp': 1498481660,
'upload_date': '20170626',
'uploader': 'no-namamae',
'uploader_id': '40826363',
'thumbnail': r're:https?://.*',
'duration': 198,
'view_count': int,
'comment_count': int,
'genres': ['アニメ'],
'tags': [],
},
'params': {'skip_download': 'm3u8'},
}, {
# Video without owner
'url': 'http://www.nicovideo.jp/watch/sm18238488',
'info_dict': {
'id': 'sm18238488',
'ext': 'mp4',
'title': '【実写版】ミュータントタートルズ',
'description': 'md5:15df8988e47a86f9e978af2064bf6d8e',
'timestamp': 1341128008,
'upload_date': '20120701',
'thumbnail': r're:https?://.*',
'duration': 5271,
'view_count': int,
'title': '12/25放送分】『生対談!!ひろゆきと戀塚のニコニコを作った人 』前半',
'availability': 'public',
'channel': 'nicolive',
'channel_id': '394',
'comment_count': int,
'description': 'md5:79fc3a54cfdc93ecc2b883285149e548',
'display_id': 'nl1872567',
'duration': 586,
'genres': ['エンターテイメント'],
'tags': [],
'like_count': int,
'tags': 'mincount:3',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1198637246,
'upload_date': '20071226',
'uploader': 'nicolive',
'uploader_id': '394',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://sp.nicovideo.jp/watch/sm28964488?ss_pos=1&cp_in=wt_tg',
'only_matching': True,
}, {
'note': 'a video that is only served as an ENCRYPTED HLS.',
'url': 'https://www.nicovideo.jp/watch/so38016254',
'only_matching': True,
'info_dict': {
'id': 'so38016254',
'ext': 'mp4',
'title': '「のんのんびより のんすとっぷ」 PV',
'availability': 'public',
'channel': 'のんのんびより のんすとっぷ',
'channel_id': 'ch2647028',
'comment_count': int,
'description': 'md5:6e2ff55b33e3645d59ef010869cde6a2',
'display_id': 'so38016254',
'duration': 114,
'genres': ['アニメ'],
'like_count': int,
'tags': 'mincount:4',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1609146000,
'upload_date': '20201228',
'uploader': 'のんのんびより のんすとっぷ',
'uploader_id': 'ch2647028',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
}, {
# smile official, but marked as user video
'url': 'https://www.nicovideo.jp/watch/so37602536',
'info_dict': {
'id': 'so37602536',
'ext': 'mp4',
'title': '田中有紀とゆきだるまと! 限定放送アーカイブ第12回',
'availability': 'subscriber_only',
'channel': 'あみあみ16',
'channel_id': '91072761',
'comment_count': int,
'description': 'md5:2ee357ec4e76d7804fb59af77107ab67',
'display_id': 'so37602536',
'duration': 980,
'genres': ['エンターテイメント'],
'like_count': int,
'tags': 'count:4',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1601377200,
'upload_date': '20200929',
'uploader': 'あみあみ16',
'uploader_id': '91072761',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
'skip': 'Channel members only',
}, {
'url': 'https://www.nicovideo.jp/watch/so41370536',
'info_dict': {
'id': 'so41370536',
'ext': 'mp4',
'title': 'ZUN【出演者別】超パーティー2022',
'availability': 'premium_only',
'channel': 'ニコニコ超会議チャンネル',
'channel_id': 'ch2607134',
'comment_count': int,
'description': 'md5:5692db5ac40d3a374fc5ec182d0249c3',
'display_id': 'so41370536',
'duration': 63,
'genres': ['音楽・サウンド'],
'like_count': int,
'tags': 'mincount:5',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1668394800,
'upload_date': '20221114',
'uploader': 'ニコニコ超会議チャンネル',
'uploader_id': 'ch2607134',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
'skip': 'Premium members only',
}, {
'url': 'https://www.nicovideo.jp/watch/so37574174',
'info_dict': {
'id': 'so37574174',
'ext': 'mp4',
'title': 'ひぐらしのなく頃に 廿回し編\u3000第1回',
'availability': 'subscriber_only',
'channel': '「ひぐらしのなく頃に」オフィシャルチャンネル',
'channel_id': 'ch2646036',
'comment_count': int,
'description': 'md5:5296196d51d9c0b7272b73f9a99c236a',
'display_id': 'so37574174',
'duration': 1931,
'genres': ['ラジオ'],
'like_count': int,
'tags': 'mincount:5',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1601028000,
'upload_date': '20200925',
'uploader': '「ひぐらしのなく頃に」オフィシャルチャンネル',
'uploader_id': 'ch2646036',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
'skip': 'Channel members only',
}, {
'url': 'https://www.nicovideo.jp/watch/so44060088',
'info_dict': {
'id': 'so44060088',
'ext': 'mp4',
'title': '松田的超英雄電波。《仮面ライダーガッチャード 放送終了記念特別番組》',
'availability': 'subscriber_only',
'channel': 'あみあみチャンネル',
'channel_id': 'ch2638921',
'comment_count': int,
'description': 'md5:9dec5bb9a172b6d20a255ecb64fbd03e',
'display_id': 'so44060088',
'duration': 1881,
'genres': ['ラジオ'],
'like_count': int,
'tags': 'mincount:7',
'thumbnail': r're:https?://img\.cdn\.nimg\.jp/s/nicovideo/thumbnails/.+',
'timestamp': 1725361200,
'upload_date': '20240903',
'uploader': 'あみあみチャンネル',
'uploader_id': 'ch2638921',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
'skip': 'Channel members only; specified continuous membership period required',
}]
_VALID_URL = r'https?://(?:(?:www\.|secure\.|sp\.)?nicovideo\.jp/watch|nico\.ms)/(?P<id>(?:[a-z]{2})?[0-9]+)'
def _yield_dms_formats(self, api_data, video_id):
def _extract_formats(self, api_data, video_id):
fmt_filter = lambda _, v: v['isAvailable'] and v['id']
videos = traverse_obj(api_data, ('media', 'domand', 'videos', fmt_filter))
audios = traverse_obj(api_data, ('media', 'domand', 'audios', fmt_filter))
@@ -247,164 +372,135 @@ class NiconicoIE(NiconicoBaseIE):
if not all((videos, audios, access_key, track_id)):
return
dms_m3u8_url = self._download_json(
f'https://nvapi.nicovideo.jp/v1/watch/{video_id}/access-rights/hls', video_id,
data=json.dumps({
m3u8_url = self._download_json(
f'{self._API_BASE}/v1/watch/{video_id}/access-rights/hls',
video_id, headers={
'Accept': 'application/json;charset=utf-8',
'Content-Type': 'application/json',
'X-Access-Right-Key': access_key,
'X-Request-With': self._BASE_URL,
**self._HEADERS,
}, query={
'actionTrackId': track_id,
}, data=json.dumps({
'outputs': list(itertools.product((v['id'] for v in videos), (a['id'] for a in audios))),
}).encode(), query={'actionTrackId': track_id}, headers={
'x-access-right-key': access_key,
'x-frontend-id': 6,
'x-frontend-version': 0,
'x-request-with': 'https://www.nicovideo.jp',
})['data']['contentUrl']
# Getting all audio formats results in duplicate video formats which we filter out later
dms_fmts = self._extract_m3u8_formats(dms_m3u8_url, video_id, 'mp4')
}).encode(),
)['data']['contentUrl']
raw_fmts = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
# m3u8 extraction does not provide audio bitrates, so extract from the API data and fix
for audio_fmt in traverse_obj(dms_fmts, lambda _, v: v['vcodec'] == 'none'):
yield {
**audio_fmt,
**traverse_obj(audios, (lambda _, v: audio_fmt['format_id'].startswith(v['id']), {
'format_id': ('id', {str}),
formats = []
for a_fmt in traverse_obj(raw_fmts, lambda _, v: v['vcodec'] == 'none'):
formats.append({
**a_fmt,
**traverse_obj(audios, (lambda _, v: a_fmt['format_id'].startswith(v['id']), {
'abr': ('bitRate', {float_or_none(scale=1000)}),
'asr': ('samplingRate', {int_or_none}),
'format_id': ('id', {str}),
'quality': ('qualityLevel', {int_or_none}),
}), get_all=False),
}, any)),
'acodec': 'aac',
}
})
# Sort before removing dupes to keep the format dicts with the lowest tbr
video_fmts = sorted((fmt for fmt in dms_fmts if fmt['vcodec'] != 'none'), key=lambda f: f['tbr'])
self._remove_duplicate_formats(video_fmts)
# Sort first, keeping the lowest-tbr formats
v_fmts = sorted((fmt for fmt in raw_fmts if fmt['vcodec'] != 'none'), key=lambda f: f['tbr'])
self._remove_duplicate_formats(v_fmts)
# Calculate the true vbr/tbr by subtracting the lowest abr
min_abr = min(traverse_obj(audios, (..., 'bitRate', {float_or_none})), default=0) / 1000
for video_fmt in video_fmts:
video_fmt['tbr'] -= min_abr
video_fmt['format_id'] = url_basename(video_fmt['url']).rpartition('.')[0]
video_fmt['quality'] = traverse_obj(videos, (
lambda _, v: v['id'] == video_fmt['format_id'], 'qualityLevel', {int_or_none}, any)) or -1
yield video_fmt
min_abr = traverse_obj(audios, (..., 'bitRate', {float_or_none(scale=1000)}, all, {min})) or 0
for v_fmt in v_fmts:
v_fmt['format_id'] = url_basename(v_fmt['url']).rpartition('.')[0]
v_fmt['quality'] = traverse_obj(videos, (
lambda _, v: v['id'] == v_fmt['format_id'], 'qualityLevel', {int_or_none}, any)) or -1
v_fmt['tbr'] -= min_abr
formats.extend(v_fmts)
def _extract_server_response(self, webpage, video_id, fatal=True):
try:
return traverse_obj(
self._parse_json(self._html_search_meta('server-response', webpage) or '', video_id),
('data', 'response', {dict}, {require('server response')}))
except ExtractorError:
if not fatal:
return {}
raise
return formats
def _real_extract(self, url):
video_id = self._match_id(url)
try:
webpage, handle = self._download_webpage_handle(
f'https://www.nicovideo.jp/watch/{video_id}', video_id,
headers=self.geo_verification_headers())
if video_id.startswith('so'):
video_id = self._match_id(handle.url)
path = 'v3' if self.is_logged_in else 'v3_guest'
api_resp = self._download_json(
f'{self._BASE_URL}/api/watch/{path}/{video_id}', video_id,
'Downloading API JSON', 'Unable to fetch data', headers={
**self._HEADERS,
**self.geo_verification_headers(),
}, query={
'actionTrackId': f'AAAAAAAAAA_{round(time_seconds() * 1000)}',
}, expected_status=[400, 404])
api_data = self._extract_server_response(webpage, video_id)
except ExtractorError as e:
try:
api_data = self._download_json(
f'https://www.nicovideo.jp/api/watch/v3/{video_id}', video_id,
'Downloading API JSON', 'Unable to fetch data', query={
'_frontendId': '6',
'_frontendVersion': '0',
'actionTrackId': f'AAAAAAAAAA_{round(time.time() * 1000)}',
}, headers=self.geo_verification_headers())['data']
except ExtractorError:
if not isinstance(e.cause, HTTPError):
# Raise if original exception was from _parse_json or utils.traversal.require
raise
# The webpage server response has more detailed error info than the API response
webpage = e.cause.response.read().decode('utf-8', 'replace')
reason_code = self._extract_server_response(
webpage, video_id, fatal=False).get('reasonCode')
if not reason_code:
raise
if reason_code in ('DOMESTIC_VIDEO', 'HIGH_RISK_COUNTRY_VIDEO'):
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
elif reason_code == 'HIDDEN_VIDEO':
raise ExtractorError(
'The viewing period of this video has expired', expected=True)
elif reason_code == 'DELETED_VIDEO':
raise ExtractorError('This video has been deleted', expected=True)
raise ExtractorError(f'Niconico says: {reason_code}')
api_data = api_resp['data']
scheduled_time = traverse_obj(api_data, ('publishScheduledAt', {str}))
status = traverse_obj(api_resp, ('meta', 'status', {int}))
availability = self._availability(**(traverse_obj(api_data, ('payment', 'video', {
'needs_premium': ('isPremium', {bool}),
if status != 200:
err_code = traverse_obj(api_resp, ('meta', 'errorCode', {str.upper}))
reason_code = traverse_obj(api_data, ('reasonCode', {str_or_none}))
err_msg = traverse_obj(self._ERROR_MAP, (err_code, (reason_code, 'DEFAULT'), {str}, any))
if reason_code in ('DOMESTIC_VIDEO', 'HIGH_RISK_COUNTRY_VIDEO'):
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
elif reason_code == 'HARMFUL_VIDEO' and traverse_obj(api_data, (
'viewer', 'allowSensitiveContents', {bool},
)) is False:
err_msg = 'Sensitive content, adjust display settings to watch'
elif reason_code == 'HIDDEN_VIDEO' and scheduled_time:
err_msg = f'This content is scheduled to be released at {scheduled_time}'
elif reason_code in ('CHANNEL_MEMBER_ONLY', 'HARMFUL_VIDEO', 'HIDDEN_VIDEO', 'PPV_VIDEO', 'PREMIUM_ONLY'):
self.raise_login_required(err_msg)
if err_msg:
raise ExtractorError(err_msg, expected=True)
if status and status >= 500:
raise ExtractorError('Service temporarily unavailable', expected=True)
raise ExtractorError(f'API returned error status {status}')
availability = self._availability(**traverse_obj(api_data, ('payment', 'video', {
'needs_auth': (('isContinuationBenefit', 'isPpv'), {bool}, any),
'needs_subscription': ('isAdmission', {bool}),
})) or {'needs_auth': True}))
'needs_premium': ('isPremium', {bool}),
}))) or 'public'
formats = list(self._yield_dms_formats(api_data, video_id))
if not formats:
fail_msg = clean_html(self._html_search_regex(
r'<p[^>]+\bclass="fail-message"[^>]*>(?P<msg>.+?)</p>',
webpage, 'fail message', default=None, group='msg'))
if fail_msg:
self.to_screen(f'Niconico said: {fail_msg}')
if fail_msg and 'された地域と同じ地域からのみ視聴できます。' in fail_msg:
availability = None
self.raise_geo_restricted(countries=self._GEO_COUNTRIES, metadata_available=True)
elif availability == 'premium_only':
self.raise_login_required('This video requires premium', metadata_available=True)
elif availability == 'subscriber_only':
self.raise_login_required('This video is for members only', metadata_available=True)
elif availability == 'needs_auth':
self.raise_login_required(metadata_available=False)
# Start extracting information
tags = None
if webpage:
# use og:video:tag (not logged in)
og_video_tags = re.finditer(r'<meta\s+property="og:video:tag"\s*content="(.*?)">', webpage)
tags = list(filter(None, (clean_html(x.group(1)) for x in og_video_tags)))
if not tags:
# use keywords and split with comma (not logged in)
kwds = self._html_search_meta('keywords', webpage, default=None)
if kwds:
tags = [x for x in kwds.split(',') if x]
if not tags:
# find in json (logged in)
tags = traverse_obj(api_data, ('tag', 'items', ..., 'name'))
formats = self._extract_formats(api_data, video_id)
err_msg = self._STATUS_MAP.get(availability)
if not formats and err_msg:
self.raise_login_required(err_msg, metadata_available=True)
thumb_prefs = qualities(['url', 'middleUrl', 'largeUrl', 'player', 'ogp'])
def get_video_info(*items, get_first=True, **kwargs):
return traverse_obj(api_data, ('video', *items), get_all=not get_first, **kwargs)
return {
'id': video_id,
'_api_data': api_data,
'title': get_video_info(('originalTitle', 'title')) or self._og_search_title(webpage, default=None),
'formats': formats,
'availability': availability,
'thumbnails': [{
'id': key,
'url': url,
'ext': 'jpg',
'preference': thumb_prefs(key),
**parse_resolution(url, lenient=True),
} for key, url in (get_video_info('thumbnail') or {}).items() if url],
'description': clean_html(get_video_info('description')),
'uploader': traverse_obj(api_data, ('owner', 'nickname'), ('channel', 'name'), ('community', 'name')),
'uploader_id': str_or_none(traverse_obj(api_data, ('owner', 'id'), ('channel', 'id'), ('community', 'id'))),
'timestamp': parse_iso8601(get_video_info('registeredAt')) or parse_iso8601(
self._html_search_meta('video:release_date', webpage, 'date published', default=None)),
'channel': traverse_obj(api_data, ('channel', 'name'), ('community', 'name')),
'channel_id': traverse_obj(api_data, ('channel', 'id'), ('community', 'id')),
'view_count': int_or_none(get_video_info('count', 'view')),
'tags': tags,
'genre': traverse_obj(api_data, ('genre', 'label'), ('genre', 'key')),
'comment_count': get_video_info('count', 'comment', expected_type=int),
'duration': (
parse_duration(self._html_search_meta('video:duration', webpage, 'video duration', default=None))
or get_video_info('duration')),
'webpage_url': url_or_none(url) or f'https://www.nicovideo.jp/watch/{video_id}',
'display_id': video_id,
'formats': formats,
'genres': traverse_obj(api_data, ('genre', 'label', {str}, filter, all, filter)),
'release_timestamp': parse_iso8601(scheduled_time),
'subtitles': self.extract_subtitles(video_id, api_data),
'tags': traverse_obj(api_data, ('tag', 'items', ..., 'name', {str}, filter, all, filter)),
'thumbnails': [{
'ext': 'jpg',
'id': key,
'preference': thumb_prefs(key),
'url': url,
**parse_resolution(url, lenient=True),
} for key, url in traverse_obj(api_data, (
'video', 'thumbnail', {dict}), default={}).items()],
**traverse_obj(api_data, (('channel', 'owner'), any, {
'channel': (('name', 'nickname'), {str}, any),
'channel_id': ('id', {str_or_none}),
'uploader': (('name', 'nickname'), {str}, any),
'uploader_id': ('id', {str_or_none}),
})),
**traverse_obj(api_data, ('video', {
'id': ('id', {str_or_none}),
'title': ('title', {str}),
'description': ('description', {clean_html}, filter),
'duration': ('duration', {int_or_none}),
'timestamp': ('registeredAt', {parse_iso8601}),
})),
**traverse_obj(api_data, ('video', 'count', {
'comment_count': ('comment', {int_or_none}),
'like_count': ('like', {int_or_none}),
'view_count': ('view', {int_or_none}),
})),
}
def _get_subtitles(self, video_id, api_data):
@@ -413,21 +509,19 @@ class NiconicoIE(NiconicoBaseIE):
return
danmaku = traverse_obj(self._download_json(
f'{comments_info["server"]}/v1/threads', video_id, data=json.dumps({
f'{comments_info["server"]}/v1/threads', video_id,
'Downloading comments', 'Failed to download comments', headers={
'Content-Type': 'text/plain;charset=UTF-8',
'Origin': self._BASE_URL,
'Referer': f'{self._BASE_URL}/',
'X-Client-Os-Type': 'others',
**self._HEADERS,
}, data=json.dumps({
'additionals': {},
'params': comments_info.get('params'),
'threadKey': comments_info.get('threadKey'),
}).encode(), fatal=False,
headers={
'Referer': 'https://www.nicovideo.jp/',
'Origin': 'https://www.nicovideo.jp',
'Content-Type': 'text/plain;charset=UTF-8',
'x-client-os-type': 'others',
'x-frontend-id': '6',
'x-frontend-version': '0',
},
note='Downloading comments', errnote='Failed to download comments'),
('data', 'threads', ..., 'comments', ...))
), ('data', 'threads', ..., 'comments', ...))
return {
'comments': [{
@@ -779,39 +873,67 @@ class NiconicoLiveIE(NiconicoBaseIE):
IE_DESC = 'ニコニコ生放送'
_VALID_URL = r'https?://(?:sp\.)?live2?\.nicovideo\.jp/(?:watch|gate)/(?P<id>lv\d+)'
_TESTS = [{
'note': 'this test case includes invisible characters for title, pasting them as-is',
'url': 'https://live.nicovideo.jp/watch/lv339533123',
'url': 'https://live.nicovideo.jp/watch/lv329299587',
'info_dict': {
'id': 'lv339533123',
'title': '激辛ペヤング食べます\u202a( ;ᯅ; )\u202c(歌枠オーディション参加中)',
'view_count': 1526,
'comment_count': 1772,
'description': '初めましてもかって言います❕\nのんびり自由に適当に暮らしてます',
'uploader': 'もか',
'channel': 'ゲストさんのコミュニティ',
'channel_id': 'co5776900',
'channel_url': 'https://com.nicovideo.jp/community/co5776900',
'timestamp': 1670677328,
'is_live': True,
'id': 'lv329299587',
'ext': 'mp4',
'title': str,
'channel': 'ニコニコエンタメチャンネル',
'channel_id': 'ch2640322',
'channel_url': 'https://ch.nicovideo.jp/channel/ch2640322',
'comment_count': int,
'description': 'md5:281edd7f00309e99ec46a87fb16d7033',
'live_status': 'is_live',
'thumbnail': r're:https?://.+',
'timestamp': 1608803400,
'upload_date': '20201224',
'uploader': '株式会社ドワンゴ',
'view_count': int,
},
'skip': 'livestream',
'params': {'skip_download': True},
}, {
'url': 'https://live2.nicovideo.jp/watch/lv339533123',
'only_matching': True,
}, {
'url': 'https://sp.live.nicovideo.jp/watch/lv339533123',
'only_matching': True,
}, {
'url': 'https://sp.live2.nicovideo.jp/watch/lv339533123',
'only_matching': True,
'url': 'https://live.nicovideo.jp/watch/lv331050399',
'info_dict': {
'id': 'lv331050399',
'ext': 'mp4',
'title': str,
'age_limit': 18,
'channel': 'みんなのおもちゃ REBOOT',
'channel_id': 'ch2642088',
'channel_url': 'https://ch.nicovideo.jp/channel/ch2642088',
'comment_count': int,
'description': 'md5:8d0bb5beaca73b911725478a1e7c7b91',
'live_status': 'is_live',
'thumbnail': r're:https?://.+',
'timestamp': 1617029400,
'upload_date': '20210329',
'uploader': '株式会社ドワンゴ',
'view_count': int,
},
'params': {'skip_download': True},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id, expected_status=404)
webpage, urlh = self._download_webpage_handle(url, video_id, expected_status=404)
if err_msg := traverse_obj(webpage, ({find_element(cls='message')}, {clean_html})):
raise ExtractorError(err_msg, expected=True)
age_limit = 18 if 'age_auth' in urlh.url else None
if age_limit:
if not self.is_logged_in:
self.raise_login_required('Login is required to access age-restricted content')
my = self._download_webpage('https://www.nicovideo.jp/my', None, 'Checking age verification')
if traverse_obj(my, (
{find_element(id='js-initial-userpage-data', html=True)}, {extract_attributes},
'data-environment', {json.loads}, 'allowSensitiveContents', {bool},
)):
self._set_cookie('.nicovideo.jp', 'age_auth', '1')
webpage = self._download_webpage(url, video_id)
else:
raise ExtractorError('Sensitive content setting must be enabled', expected=True)
embedded_data = traverse_obj(webpage, (
{find_element(tag='script', id='embedded-data', html=True)},
{extract_attributes}, 'data-props', {json.loads}))
@@ -914,6 +1036,7 @@ class NiconicoLiveIE(NiconicoBaseIE):
return {
'id': video_id,
'title': title,
'age_limit': age_limit,
'downloader_options': {
'max_quality': traverse_obj(embedded_data, ('program', 'stream', 'maxQuality', {str})) or 'normal',
'ws': ws,

View File

@@ -446,9 +446,10 @@ class NRKTVIE(InfoExtractor):
class NRKTVEpisodeIE(InfoExtractor):
_VALID_URL = r'https?://tv\.nrk\.no/serie/(?P<id>[^/]+/sesong/(?P<season_number>\d+)/episode/(?P<episode_number>\d+))'
_VALID_URL = r'https?://tv\.nrk\.no/serie/(?P<id>[^/?#]+/sesong/(?P<season_number>\d+)/episode/(?P<episode_number>\d+))'
_TESTS = [{
'url': 'https://tv.nrk.no/serie/hellums-kro/sesong/1/episode/2',
'add_ie': [NRKIE.ie_key()],
'info_dict': {
'id': 'MUHH36005220',
'ext': 'mp4',
@@ -485,26 +486,23 @@ class NRKTVEpisodeIE(InfoExtractor):
}]
def _real_extract(self, url):
display_id, season_number, episode_number = self._match_valid_url(url).groups()
display_id, season_number, episode_number = self._match_valid_url(url).group(
'id', 'season_number', 'episode_number')
webpage, urlh = self._download_webpage_handle(url, display_id)
webpage = self._download_webpage(url, display_id)
if NRKTVIE.suitable(urlh.url):
nrk_id = NRKTVIE._match_id(urlh.url)
else:
nrk_id = self._search_json(
r'<script\b[^>]+\bid="pageData"[^>]*>', webpage,
'page data', display_id)['initialState']['selectedEpisodePrfId']
if not re.fullmatch(NRKTVIE._EPISODE_RE, nrk_id):
raise ExtractorError('Unable to extract NRK ID')
info = self._search_json_ld(webpage, display_id, default={})
nrk_id = info.get('@id') or self._html_search_meta(
'nrk:program-id', webpage, default=None) or self._search_regex(
rf'data-program-id=["\']({NRKTVIE._EPISODE_RE})', webpage,
'nrk id')
assert re.match(NRKTVIE._EPISODE_RE, nrk_id)
info.update({
'_type': 'url',
'id': nrk_id,
'url': f'nrk:{nrk_id}',
'ie_key': NRKIE.ie_key(),
'season_number': int(season_number),
'episode_number': int(episode_number),
})
return info
return self.url_result(
f'nrk:{nrk_id}', NRKIE, nrk_id,
season_number=int(season_number),
episode_number=int(episode_number))
class NRKTVSerieBaseIE(NRKBaseIE):

View File

@@ -352,14 +352,6 @@ class OdnoklassnikiIE(InfoExtractor):
'subtitles': subtitles,
}
# pladform
if provider == 'OPEN_GRAPH':
info.update({
'_type': 'url_transparent',
'url': movie['contentId'],
})
return info
if provider == 'USER_YOUTUBE':
info.update({
'_type': 'url_transparent',

View File

@@ -73,163 +73,179 @@ class PanoptoBaseIE(InfoExtractor):
class PanoptoIE(PanoptoBaseIE):
_VALID_URL = PanoptoBaseIE.BASE_URL_RE + r'/Pages/(Viewer|Embed)\.aspx.*(?:\?|&)id=(?P<id>[a-f0-9-]+)'
_EMBED_REGEX = [rf'<iframe[^>]+src=["\'](?P<url>{PanoptoBaseIE.BASE_URL_RE}/Pages/(Viewer|Embed|Sessions/List)\.aspx[^"\']+)']
_TESTS = [
{
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=26b3ae9e-4a48-4dcc-96ba-0befba08a0fb',
'info_dict': {
'id': '26b3ae9e-4a48-4dcc-96ba-0befba08a0fb',
'title': 'Panopto for Business - Use Cases',
'timestamp': 1459184200,
'thumbnail': r're:https://demo\.hosted\.panopto\.com/.+',
'upload_date': '20160328',
'ext': 'mp4',
'cast': [],
'chapters': [],
'duration': 88.17099999999999,
'average_rating': int,
'uploader_id': '2db6b718-47a0-4b0b-9e17-ab0b00f42b1e',
'channel_id': 'e4c6a2fc-1214-4ca0-8fb7-aef2e29ff63a',
'channel': 'Showcase Videos',
},
_TESTS = [{
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=26b3ae9e-4a48-4dcc-96ba-0befba08a0fb',
'info_dict': {
'id': '26b3ae9e-4a48-4dcc-96ba-0befba08a0fb',
'title': 'Panopto for Business - Use Cases',
'timestamp': 1459184200,
'thumbnail': r're:https?://demo\.hosted\.panopto\.com/.+',
'upload_date': '20160328',
'ext': 'mp4',
'cast': [],
'chapters': [],
'duration': 88.17099999999999,
'average_rating': int,
'tags': [],
'uploader_id': '2db6b718-47a0-4b0b-9e17-ab0b00f42b1e',
'channel_id': 'bb0b58ff-b31b-47a0-9aa2-af6f0113613a',
'channel': 'Product',
},
{
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=ed01b077-c9e5-4c7b-b8ff-15fa306d7a59',
'info_dict': {
'id': 'ed01b077-c9e5-4c7b-b8ff-15fa306d7a59',
'title': 'Overcoming Top 4 Challenges of Enterprise Video',
'uploader': 'Panopto Support',
'timestamp': 1449409251,
'thumbnail': r're:https://demo\.hosted\.panopto\.com/.+',
'upload_date': '20151206',
'ext': 'mp4',
'chapters': 'count:12',
'cast': ['Panopto Support'],
'uploader_id': 'a96d1a31-b4de-489b-9eee-b4a5b414372c',
'average_rating': int,
'description': 'md5:4391837802b3fc856dadf630c4b375d1',
'duration': 1088.2659999999998,
'channel_id': '9f3c1921-43bb-4bda-8b3a-b8d2f05a8546',
'channel': 'Webcasts',
},
}, {
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=ed01b077-c9e5-4c7b-b8ff-15fa306d7a59',
'info_dict': {
'id': 'ed01b077-c9e5-4c7b-b8ff-15fa306d7a59',
'title': 'Overcoming Top 4 Challenges of Enterprise Video',
'uploader': 'Panopto Support',
'timestamp': 1449409251,
'thumbnail': r're:https?://demo\.hosted\.panopto\.com/.+',
'upload_date': '20151206',
'ext': 'mp4',
'chapters': 'count:13',
'cast': ['Panopto Support'],
'tags': [],
'uploader_id': 'a96d1a31-b4de-489b-9eee-b4a5b414372c',
'average_rating': int,
'description': 'md5:4391837802b3fc856dadf630c4b375d1',
'duration': 1088.2659999999998,
'channel_id': '9f3c1921-43bb-4bda-8b3a-b8d2f05a8546',
'channel': 'Webcasts',
},
{
# Extra params in URL
'url': 'https://howtovideos.hosted.panopto.com/Panopto/Pages/Viewer.aspx?randomparam=thisisnotreal&id=5fa74e93-3d87-4694-b60e-aaa4012214ed&advance=true',
'info_dict': {
'id': '5fa74e93-3d87-4694-b60e-aaa4012214ed',
'ext': 'mp4',
'duration': 129.513,
'cast': ['Kathryn Kelly'],
'uploader_id': '316a0a58-7fa2-4cd9-be1c-64270d284a56',
'timestamp': 1569845768,
'tags': ['Viewer', 'Enterprise'],
'chapters': [],
'upload_date': '20190930',
'thumbnail': r're:https://howtovideos\.hosted\.panopto\.com/.+',
'description': 'md5:2d844aaa1b1a14ad0e2601a0993b431f',
'title': 'Getting Started: View a Video',
'average_rating': int,
'uploader': 'Kathryn Kelly',
'channel_id': 'fb93bc3c-6750-4b80-a05b-a921013735d3',
'channel': 'Getting Started',
},
}, {
# Extra params in URL
'url': 'https://howtovideos.hosted.panopto.com/Panopto/Pages/Viewer.aspx?randomparam=thisisnotreal&id=5fa74e93-3d87-4694-b60e-aaa4012214ed&advance=true',
'info_dict': {
'id': '5fa74e93-3d87-4694-b60e-aaa4012214ed',
'ext': 'mp4',
'duration': 129.513,
'cast': ['Kathryn Kelly'],
'uploader_id': '316a0a58-7fa2-4cd9-be1c-64270d284a56',
'timestamp': 1569845768,
'tags': ['Viewer', 'Enterprise'],
'chapters': [],
'upload_date': '20190930',
'thumbnail': r're:https?://howtovideos\.hosted\.panopto\.com/.+',
'description': 'md5:2d844aaa1b1a14ad0e2601a0993b431f',
'title': 'Getting Started: View a Video',
'average_rating': int,
'uploader': 'Kathryn Kelly',
'channel_id': 'fb93bc3c-6750-4b80-a05b-a921013735d3',
'channel': 'Getting Started',
},
{
# Does not allow normal Viewer.aspx. AUDIO livestream has no url, so should be skipped and only give one stream.
'url': 'https://unisa.au.panopto.com/Panopto/Pages/Embed.aspx?id=9d9a0fa3-e99a-4ebd-a281-aac2017f4da4',
'info_dict': {
'id': '9d9a0fa3-e99a-4ebd-a281-aac2017f4da4',
'ext': 'mp4',
'cast': ['LTS CLI Script'],
'chapters': [],
'duration': 2178.45,
'description': 'md5:ee5cf653919f55b72bce2dbcf829c9fa',
'channel_id': 'b23e673f-c287-4cb1-8344-aae9005a69f8',
'average_rating': int,
'uploader_id': '38377323-6a23-41e2-9ff6-a8e8004bf6f7',
'uploader': 'LTS CLI Script',
'timestamp': 1572458134,
'title': 'WW2 Vets Interview 3 Ronald Stanley George',
'thumbnail': r're:https://unisa\.au\.panopto\.com/.+',
'channel': 'World War II Veteran Interviews',
'upload_date': '20191030',
},
'skip': 'Invalid URL',
}, {
# Does not allow normal Viewer.aspx. AUDIO livestream has no url, so should be skipped and only give one stream.
'url': 'https://unisa.au.panopto.com/Panopto/Pages/Embed.aspx?id=9d9a0fa3-e99a-4ebd-a281-aac2017f4da4',
'info_dict': {
'id': '9d9a0fa3-e99a-4ebd-a281-aac2017f4da4',
'ext': 'mp4',
'cast': ['LTS CLI Script'],
'chapters': [],
'duration': 2178.45,
'description': 'md5:ee5cf653919f55b72bce2dbcf829c9fa',
'channel_id': 'b23e673f-c287-4cb1-8344-aae9005a69f8',
'average_rating': int,
'uploader_id': '38377323-6a23-41e2-9ff6-a8e8004bf6f7',
'uploader': 'LTS CLI Script',
'tags': [],
'timestamp': 1572458134,
'title': 'WW2 Vets Interview 3 Ronald Stanley George',
'thumbnail': r're:https?://unisa\.au\.panopto\.com/.+',
'channel': 'World War II Veteran Interviews',
'upload_date': '20191030',
},
{
# Slides/storyboard
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=a7f12f1d-3872-4310-84b0-f8d8ab15326b',
'info_dict': {
'id': 'a7f12f1d-3872-4310-84b0-f8d8ab15326b',
'ext': 'mhtml',
'timestamp': 1448798857,
'duration': 4712.681,
'title': 'Cache Memory - CompSci 15-213, Lecture 12',
'channel_id': 'e4c6a2fc-1214-4ca0-8fb7-aef2e29ff63a',
'uploader_id': 'a96d1a31-b4de-489b-9eee-b4a5b414372c',
'upload_date': '20151129',
'average_rating': 0,
'uploader': 'Panopto Support',
'channel': 'Showcase Videos',
'description': 'md5:55e51d54233ddb0e6c2ed388ca73822c',
'cast': ['ISR Videographer', 'Panopto Support'],
'chapters': 'count:28',
'thumbnail': r're:https://demo\.hosted\.panopto\.com/.+',
},
'params': {'format': 'mhtml', 'skip_download': True},
}, {
# Slides/storyboard
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=a7f12f1d-3872-4310-84b0-f8d8ab15326b',
'info_dict': {
'id': 'a7f12f1d-3872-4310-84b0-f8d8ab15326b',
'ext': 'mhtml',
'timestamp': 1448798857,
'duration': 4712.681,
'title': 'Cache Memory - CompSci 15-213, Lecture 12',
'channel_id': '0202d932-6d28-4fb2-b373-af6f0121c8f0',
'uploader_id': 'a96d1a31-b4de-489b-9eee-b4a5b414372c',
'upload_date': '20151129',
'average_rating': 0,
'uploader': 'Panopto Support',
'channel': 'Customer Demonstrations',
'description': 'md5:55e51d54233ddb0e6c2ed388ca73822c',
'cast': ['ISR Videographer', 'Panopto Support'],
'chapters': 'count:28',
'tags': [],
'thumbnail': r're:https?://demo\.hosted\.panopto\.com/.+',
},
{
'url': 'https://na-training-1.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=8285224a-9a2b-4957-84f2-acb0000c4ea9',
'info_dict': {
'id': '8285224a-9a2b-4957-84f2-acb0000c4ea9',
'ext': 'mp4',
'chapters': [],
'title': 'Company Policy',
'average_rating': 0,
'timestamp': 1615058901,
'channel': 'Human Resources',
'tags': ['HumanResources'],
'duration': 1604.243,
'thumbnail': r're:https://na-training-1\.hosted\.panopto\.com/.+',
'uploader_id': '8e8ba0a3-424f-40df-a4f1-ab3a01375103',
'uploader': 'Cait M.',
'upload_date': '20210306',
'cast': ['Cait M.'],
'subtitles': {'en-US': [{'ext': 'srt', 'data': 'md5:a3f4d25963fdeace838f327097c13265'}],
'es-ES': [{'ext': 'srt', 'data': 'md5:57e9dad365fd0fbaf0468eac4949f189'}]},
},
'params': {'writesubtitles': True, 'skip_download': True},
}, {
# On Panopto there are two subs: "Default" and en-US. en-US is blank and should be skipped.
'url': 'https://na-training-1.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=940cbd41-f616-4a45-b13e-aaf1000c915b',
'info_dict': {
'id': '940cbd41-f616-4a45-b13e-aaf1000c915b',
'ext': 'mp4',
'subtitles': 'count:1',
'title': 'HR Benefits Review Meeting*',
'cast': ['Panopto Support'],
'chapters': [],
'timestamp': 1575024251,
'thumbnail': r're:https://na-training-1\.hosted\.panopto\.com/.+',
'channel': 'Zoom',
'description': 'md5:04f90a9c2c68b7828144abfb170f0106',
'uploader': 'Panopto Support',
'average_rating': 0,
'duration': 409.34499999999997,
'uploader_id': 'b6ac04ad-38b8-4724-a004-a851004ea3df',
'upload_date': '20191129',
'params': {'format': 'mhtml', 'skip_download': True},
}, {
'url': 'https://na-training-1.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=8285224a-9a2b-4957-84f2-acb0000c4ea9',
'info_dict': {
'id': '8285224a-9a2b-4957-84f2-acb0000c4ea9',
'ext': 'mp4',
'chapters': [],
'title': 'Company Policy',
'average_rating': 0,
'timestamp': 1615058901,
'channel': 'Human Resources',
'tags': ['HumanResources'],
'duration': 1604.243,
'thumbnail': r're:https?://na-training-1\.hosted\.panopto\.com/.+',
'uploader_id': '8e8ba0a3-424f-40df-a4f1-ab3a01375103',
'uploader': 'Cait M.',
'upload_date': '20210306',
'cast': ['Cait M.'],
},
'params': {'writesubtitles': True, 'skip_download': True},
}, {
# On Panopto there are two subs: "Default" and en-US. en-US is blank and should be skipped.
'url': 'https://na-training-1.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=940cbd41-f616-4a45-b13e-aaf1000c915b',
'info_dict': {
'id': '940cbd41-f616-4a45-b13e-aaf1000c915b',
'ext': 'mp4',
'subtitles': 'count:1',
'title': 'HR Benefits Review Meeting*',
'cast': ['Panopto Support'],
'chapters': [],
'timestamp': 1575024251,
'thumbnail': r're:https://na-training-1\.hosted\.panopto\.com/.+',
'channel': 'Zoom',
'description': 'md5:04f90a9c2c68b7828144abfb170f0106',
'uploader': 'Panopto Support',
'average_rating': 0,
'duration': 409.34499999999997,
'tags': [],
'uploader_id': 'b6ac04ad-38b8-4724-a004-a851004ea3df',
'upload_date': '20191129',
},
'params': {'writesubtitles': True, 'skip_download': True},
},
{
'url': 'https://ucc.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=0e8484a4-4ceb-4d98-a63f-ac0200b455cb',
'only_matching': True,
'params': {'writesubtitles': True, 'skip_download': True},
}, {
'url': 'https://ucc.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=0e8484a4-4ceb-4d98-a63f-ac0200b455cb',
'only_matching': True,
}, {
'url': 'https://brown.hosted.panopto.com/Panopto/Pages/Embed.aspx?id=0b3ff73b-36a0-46c5-8455-aadf010a3638',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.monash.edu/learning-teaching/teachhq/learning-technologies/panopto/how-to/insert-a-quiz-into-a-panopto-video',
'info_dict': {
'id': '0bd3f16c-824a-436a-8486-ac5900693aef',
'ext': 'mp4',
'title': 'Quizzes in Panopto',
'average_rating': 0,
'cast': ['Stephanie Luo'],
'chapters': 'count:8',
'channel': 'Panopto',
'description': 'md5:731ce802eee75808b1181db1ff1b5002',
'duration': 185.833,
'tags': [],
'thumbnail': r're:https?://monash\.au\.panopto\.com/.+',
'timestamp': 1607562188,
'upload_date': '20201210',
'uploader': 'Stephanie Luo',
'uploader_id': 'b18ca46d-20df-4ff5-b0b3-aa7a00085617',
},
{
'url': 'https://brown.hosted.panopto.com/Panopto/Pages/Embed.aspx?id=0b3ff73b-36a0-46c5-8455-aadf010a3638',
'only_matching': True,
},
]
'params': {'extractor_args': {'generic': {'impersonate': ['chrome']}}},
}]
@classmethod
def suitable(cls, url):
@@ -423,27 +439,23 @@ class PanoptoIE(PanoptoBaseIE):
class PanoptoPlaylistIE(PanoptoBaseIE):
_VALID_URL = PanoptoBaseIE.BASE_URL_RE + r'/Pages/(Viewer|Embed)\.aspx.*(?:\?|&)pid=(?P<id>[a-f0-9-]+)'
_TESTS = [
{
'url': 'https://howtovideos.hosted.panopto.com/Panopto/Pages/Viewer.aspx?pid=f3b39fcf-882f-4849-93d6-a9f401236d36&id=5fa74e93-3d87-4694-b60e-aaa4012214ed&advance=true',
'info_dict': {
'title': 'Featured Video Tutorials',
'id': 'f3b39fcf-882f-4849-93d6-a9f401236d36',
'description': '',
},
'playlist_mincount': 36,
_TESTS = [{
'url': 'https://howtovideos.hosted.panopto.com/Panopto/Pages/Viewer.aspx?pid=f3b39fcf-882f-4849-93d6-a9f401236d36&id=5fa74e93-3d87-4694-b60e-aaa4012214ed&advance=true',
'info_dict': {
'id': 'f3b39fcf-882f-4849-93d6-a9f401236d36',
'title': 'Featured Video Tutorials',
'description': '',
},
{
'url': 'https://utsa.hosted.panopto.com/Panopto/Pages/Viewer.aspx?pid=e2900555-3ad4-4bdb-854d-ad2401686190',
'info_dict': {
'title': 'Library Website Introduction Playlist',
'id': 'e2900555-3ad4-4bdb-854d-ad2401686190',
'description': 'md5:f958bca50a1cbda15fdc1e20d32b3ecb',
},
'playlist_mincount': 4,
'playlist_mincount': 19,
}, {
'url': 'https://utsa.hosted.panopto.com/Panopto/Pages/Viewer.aspx?pid=e2900555-3ad4-4bdb-854d-ad2401686190',
'info_dict': {
'id': 'e2900555-3ad4-4bdb-854d-ad2401686190',
'title': 'Library Website Introduction Playlist',
'description': 'md5:f958bca50a1cbda15fdc1e20d32b3ecb',
},
]
'playlist_mincount': 4,
}]
def _entries(self, base_url, playlist_id, session_list_id):
session_list_info = self._call_api(
@@ -486,35 +498,29 @@ class PanoptoPlaylistIE(PanoptoBaseIE):
class PanoptoListIE(PanoptoBaseIE):
_VALID_URL = PanoptoBaseIE.BASE_URL_RE + r'/Pages/Sessions/List\.aspx'
_PAGE_SIZE = 250
_TESTS = [
{
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx#folderID=%22e4c6a2fc-1214-4ca0-8fb7-aef2e29ff63a%22',
'info_dict': {
'id': 'e4c6a2fc-1214-4ca0-8fb7-aef2e29ff63a',
'title': 'Showcase Videos',
},
'playlist_mincount': 140,
_TESTS = [{
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx#folderID=%22e4c6a2fc-1214-4ca0-8fb7-aef2e29ff63a%22',
'info_dict': {
'id': 'e4c6a2fc-1214-4ca0-8fb7-aef2e29ff63a',
'title': 'Showcase Videos',
},
{
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx#view=2&maxResults=250',
'info_dict': {
'id': 'panopto_list',
'title': 'panopto_list',
},
'playlist_mincount': 300,
'playlist_mincount': 8,
}, {
'url': 'https://demo.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx#view=2&maxResults=250',
'info_dict': {
'id': 'panopto_list',
'title': 'panopto_list',
},
{
# Folder that contains 8 folders and a playlist
'url': 'https://howtovideos.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx?noredirect=true#folderID=%224b9de7ae-0080-4158-8496-a9ba01692c2e%22',
'info_dict': {
'id': '4b9de7ae-0080-4158-8496-a9ba01692c2e',
'title': 'Video Tutorials',
},
'playlist_mincount': 9,
'playlist_mincount': 300,
}, {
# Folder that contains 8 folders and a playlist
'url': 'https://howtovideos.hosted.panopto.com/Panopto/Pages/Sessions/List.aspx?noredirect=true#folderID=%224b9de7ae-0080-4158-8496-a9ba01692c2e%22',
'info_dict': {
'id': '4b9de7ae-0080-4158-8496-a9ba01692c2e',
'title': 'Video Tutorials',
},
]
'playlist_mincount': 9,
}]
def _fetch_page(self, base_url, query_params, display_id, page):

View File

@@ -1,63 +1,63 @@
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
try_get,
unified_timestamp,
)
from ..utils import parse_duration, parse_iso8601, url_or_none
from ..utils.traversal import traverse_obj
class ParlviewIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'https?://(?:www\.)?parlview\.aph\.gov\.au/(?:[^/]+)?\bvideoID=(?P<id>\d{6})'
_VALID_URL = r'https?://(?:www\.)?aph\.gov\.au/News_and_Events/Watch_Read_Listen/ParlView/video/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://parlview.aph.gov.au/mediaPlayer.php?videoID=542661',
'url': 'https://www.aph.gov.au/News_and_Events/Watch_Read_Listen/ParlView/video/3406614',
'info_dict': {
'id': '542661',
'id': '3406614',
'ext': 'mp4',
'title': "Australia's Family Law System [Part 2]",
'duration': 5799,
'description': 'md5:7099883b391619dbae435891ca871a62',
'timestamp': 1621430700,
'upload_date': '20210519',
'uploader': 'Joint Committee',
'title': 'Senate Chamber',
'description': 'Official Recording of Senate Proceedings from the Australian Parliament',
'thumbnail': 'https://aphbroadcasting-prod.z01.azurefd.net/vod-storage/vod-logos/SenateParlview06.jpg',
'upload_date': '20250325',
'duration': 17999,
'timestamp': 1742939400,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://parlview.aph.gov.au/mediaPlayer.php?videoID=539936',
'only_matching': True,
'url': 'https://www.aph.gov.au/News_and_Events/Watch_Read_Listen/ParlView/video/SV1394.dv',
'info_dict': {
'id': 'SV1394.dv',
'ext': 'mp4',
'title': 'Senate Select Committee on Uranium Mining and Milling [Part 1]',
'description': 'Official Recording of Senate Committee Proceedings from the Australian Parliament',
'thumbnail': 'https://aphbroadcasting-prod.z01.azurefd.net/vod-storage/vod-logos/CommitteeThumbnail06.jpg',
'upload_date': '19960822',
'duration': 14765,
'timestamp': 840754200,
},
'params': {
'skip_download': True,
},
}]
_API_URL = 'https://parlview.aph.gov.au/api_v3/1/playback/getUniversalPlayerConfig?videoID=%s&format=json'
_MEDIA_INFO_URL = 'https://parlview.aph.gov.au/ajaxPlayer.php?videoID=%s&tabNum=4&action=loadTab'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
media = self._download_json(self._API_URL % video_id, video_id).get('media')
timestamp = try_get(media, lambda x: x['timeMap']['source']['timecode_offsets'][0], str) or '/'
video_details = self._download_json(
f'https://vodapi.aph.gov.au/api/search/parlview/{video_id}', video_id)['videoDetails']
stream = try_get(media, lambda x: x['renditions'][0], dict)
if not stream:
self.raise_no_formats('No streams were detected')
elif stream.get('streamType') != 'VOD':
self.raise_no_formats('Unknown type of stream was detected: "{}"'.format(str(stream.get('streamType'))))
formats = self._extract_m3u8_formats(stream['url'], video_id, 'mp4', 'm3u8_native')
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
video_details['files']['file']['url'], video_id, 'mp4')
media_info = self._download_webpage(
self._MEDIA_INFO_URL % video_id, video_id, note='Downloading media info', fatal=False)
DURATION_RE = re.compile(r'(?P<duration>\d+:\d+:\d+):\d+')
return {
'id': video_id,
'url': url,
'title': self._html_search_regex(r'<h2>([^<]+)<', webpage, 'title', fatal=False),
'formats': formats,
'duration': int_or_none(media.get('duration')),
'timestamp': unified_timestamp(timestamp.split('/', 1)[1].replace('_', ' ')),
'description': self._html_search_regex(
r'<div[^>]+class="descripti?on"[^>]*>[^>]+<strong>[^>]+>[^>]+>([^<]+)',
webpage, 'description', fatal=False),
'uploader': self._html_search_regex(
r'<td>[^>]+>Channel:[^>]+>([^<]+)', media_info, 'channel', fatal=False),
'thumbnail': media.get('staticImage'),
'subtitles': subtitles,
**traverse_obj(video_details, {
'title': (('parlViewTitle', 'title'), {str}, any),
'description': ('parlViewDescription', {str}),
'duration': ('files', 'file', 'duration', {DURATION_RE.fullmatch}, 'duration', {parse_duration}),
'timestamp': ('recordingFrom', {parse_iso8601}),
'thumbnail': ('thumbUrl', {url_or_none}),
}),
}

View File

@@ -1331,7 +1331,7 @@ class PeerTubeIE(InfoExtractor):
'ext': 'mp4',
'title': 'What is PeerTube?',
'description': 'md5:3fefb8dde2b189186ce0719fda6f7b10',
'thumbnail': r're:https?://.*\.(?:jpg|png)',
'thumbnail': r're:https?://framatube\.org/lazy-static/thumbnails/.+\.jpg',
'timestamp': 1538391166,
'upload_date': '20181001',
'uploader': 'Framasoft',
@@ -1346,19 +1346,34 @@ class PeerTubeIE(InfoExtractor):
'view_count': int,
'like_count': int,
'dislike_count': int,
'tags': ['framasoft', 'peertube'],
'tags': 'count:2',
'categories': ['Science & Technology'],
},
'expected_warnings': ['HTTP Error 400: Bad Request'],
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://peertube2.cpy.re/w/122d093a-1ede-43bd-bd34-59d2931ffc5e',
'info_dict': {
'id': '122d093a-1ede-43bd-bd34-59d2931ffc5e',
'ext': 'mp4',
'title': 'E2E tests',
'uploader_id': '37855',
'categories': ['Unknown'],
'channel': 'Main chocobozzz channel',
'channel_id': '5187',
'channel_url': 'https://peertube2.cpy.re/video-channels/chocobozzz_channel',
'description': 'md5:67daf92c833c41c95db874e18fcb2786',
'dislike_count': int,
'duration': 52,
'license': 'Unknown',
'like_count': int,
'tags': [],
'thumbnail': r're:https?://peertube2\.cpy\.re/lazy-static/thumbnails/.+\.jpg',
'timestamp': 1589276219,
'upload_date': '20200512',
'uploader': 'chocobozzz',
'uploader_id': '37855',
'uploader_url': 'https://peertube2.cpy.re/accounts/chocobozzz',
'view_count': int,
},
}, {
'url': 'https://peertube2.cpy.re/w/3fbif9S3WmtTP8gGsC5HBd',
@@ -1366,10 +1381,23 @@ class PeerTubeIE(InfoExtractor):
'id': '3fbif9S3WmtTP8gGsC5HBd',
'ext': 'mp4',
'title': 'E2E tests',
'uploader_id': '37855',
'categories': ['Unknown'],
'channel': 'Main chocobozzz channel',
'channel_id': '5187',
'channel_url': 'https://peertube2.cpy.re/video-channels/chocobozzz_channel',
'description': 'md5:67daf92c833c41c95db874e18fcb2786',
'dislike_count': int,
'duration': 52,
'license': 'Unknown',
'like_count': int,
'tags': [],
'thumbnail': r're:https?://peertube2\.cpy\.re/lazy-static/thumbnails/.+\.jpg',
'timestamp': 1589276219,
'upload_date': '20200512',
'uploader': 'chocobozzz',
'uploader_id': '37855',
'uploader_url': 'https://peertube2.cpy.re/accounts/chocobozzz',
'view_count': int,
},
}, {
'url': 'https://peertube2.cpy.re/api/v1/videos/3fbif9S3WmtTP8gGsC5HBd',
@@ -1377,13 +1405,26 @@ class PeerTubeIE(InfoExtractor):
'id': '3fbif9S3WmtTP8gGsC5HBd',
'ext': 'mp4',
'title': 'E2E tests',
'uploader_id': '37855',
'categories': ['Unknown'],
'channel': 'Main chocobozzz channel',
'channel_id': '5187',
'channel_url': 'https://peertube2.cpy.re/video-channels/chocobozzz_channel',
'description': 'md5:67daf92c833c41c95db874e18fcb2786',
'dislike_count': int,
'duration': 52,
'license': 'Unknown',
'like_count': int,
'tags': [],
'thumbnail': r're:https?://peertube2\.cpy\.re/lazy-static/thumbnails/.+\.jpg',
'timestamp': 1589276219,
'upload_date': '20200512',
'uploader': 'chocobozzz',
'uploader_id': '37855',
'uploader_url': 'https://peertube2.cpy.re/accounts/chocobozzz',
'view_count': int,
},
}, {
# Issue #26002
# https://github.com/ytdl-org/youtube-dl/issues/26002
'url': 'peertube:spacepub.space:d8943b2d-8280-497b-85ec-bc282ec2afdc',
'info_dict': {
'id': 'd8943b2d-8280-497b-85ec-bc282ec2afdc',
@@ -1394,6 +1435,7 @@ class PeerTubeIE(InfoExtractor):
'upload_date': '20200420',
'uploader': 'Drew DeVault',
},
'skip': 'Invalid URL',
}, {
'url': 'https://peertube.debian.social/videos/watch/0b04f13d-1e18-4f1d-814e-4979aa7c9c44',
'only_matching': True,
@@ -1411,6 +1453,33 @@ class PeerTubeIE(InfoExtractor):
'url': 'peertube:framatube.org:b37a5b9f-e6b5-415c-b700-04a5cd6ec205',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://video.macver.org/w/6gvhZpUGQVd4SQ6oYDc9pC',
'info_dict': {
'id': '6gvhZpUGQVd4SQ6oYDc9pC',
'ext': 'mp4',
'title': 'Minecraft, but if you say a block, it gets deleted',
'categories': ['Gaming'],
'channel': 'Waffle Irons Gaming',
'channel_id': '4',
'channel_url': 'https://video.macver.org/video-channels/waffle_irons',
'description': 'md5:eda8daf64b0dadd00cc248f28eef213c',
'dislike_count': int,
'duration': 1643,
'license': 'Attribution - Non Commercial',
'like_count': int,
'tags': 'count:1',
'thumbnail': r're:https?://video\.macver\.org/lazy-static/thumbnails/.+\.jpg',
'timestamp': 1751142352,
'upload_date': '20250628',
'uploader': 'Bog',
'uploader_id': '3',
'uploader_url': 'https://video.macver.org/accounts/bog',
'view_count': int,
},
'expected_warnings': ['HTTP Error 400: Bad Request', 'Ignoring subtitle tracks found in the HLS manifest'],
'params': {'skip_download': 'm3u8'},
}]
@staticmethod
def _extract_peertube_url(webpage, source_url):
@@ -1580,31 +1649,47 @@ class PeerTubePlaylistIE(InfoExtractor):
'id': 'hFdJoTuyhNJVa1cDWd1d12',
'description': 'Diversas palestras do Richard Stallman no Brasil.',
'title': 'Richard Stallman no Brasil',
'channel': 'debianbrazilteam',
'channel_id': 1522,
'thumbnail': r're:https?://peertube\.debian\.social/lazy-static/thumbnails/.+\.jpg',
'timestamp': 1599676222,
'upload_date': '20200909',
},
'playlist_mincount': 9,
}, {
'url': 'https://peertube2.cpy.re/a/chocobozzz/videos',
'info_dict': {
'id': 'chocobozzz',
'timestamp': 1553874564,
'title': 'chocobozzz',
'channel': 'chocobozzz',
'channel_id': 37855,
'thumbnail': '',
'timestamp': 1553874564,
'upload_date': '20190329',
},
'playlist_mincount': 2,
}, {
'url': 'https://framatube.org/c/bf54d359-cfad-4935-9d45-9d6be93f63e8/videos',
'info_dict': {
'id': 'bf54d359-cfad-4935-9d45-9d6be93f63e8',
'timestamp': 1519917377,
'title': 'Les vidéos de Framasoft',
'channel': 'framasoft',
'channel_id': 3,
'thumbnail': '',
'timestamp': 1519917377,
'upload_date': '20180301',
},
'playlist_mincount': 345,
}, {
'url': 'https://peertube2.cpy.re/c/blender_open_movies@video.blender.org/videos',
'info_dict': {
'id': 'blender_open_movies@video.blender.org',
'timestamp': 1542287810,
'title': 'Official Blender Open Movies',
'channel': 'blender',
'channel_id': 1926,
'thumbnail': '',
'timestamp': 1540472902,
'upload_date': '20181025',
},
'playlist_mincount': 11,
}]

View File

@@ -1,135 +0,0 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
parse_qs,
qualities,
xpath_text,
)
class PladformIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
(?:
out\.pladform\.ru/player|
static\.pladform\.ru/player\.swf
)
\?.*\bvideoid=|
video\.pladform\.ru/catalog/video/videoid/
)
(?P<id>\d+)
'''
_EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//out\.pladform\.ru/player\?.+?)\1']
_TESTS = [{
'url': 'http://out.pladform.ru/player?pl=18079&type=html5&videoid=100231282',
'info_dict': {
'id': '6216d548e755edae6e8280667d774791',
'ext': 'mp4',
'timestamp': 1406117012,
'title': 'Гарик Мартиросян и Гарик Харламов - Кастинг на концерт ко Дню милиции',
'age_limit': 0,
'upload_date': '20140723',
'thumbnail': str,
'view_count': int,
'description': str,
'uploader_id': '12082',
'uploader': 'Comedy Club',
'duration': 367,
},
'expected_warnings': ['HTTP Error 404: Not Found'],
}, {
'url': 'https://out.pladform.ru/player?pl=64471&videoid=3777899&vk_puid15=0&vk_puid34=0',
'md5': '53362fac3a27352da20fa2803cc5cd6f',
'info_dict': {
'id': '3777899',
'ext': 'mp4',
'title': 'СТУДИЯ СОЮЗ • Шоу Студия Союз, 24 выпуск (01.02.2018) Нурлан Сабуров и Слава Комиссаренко',
'description': 'md5:05140e8bf1b7e2d46e7ba140be57fd95',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 3190,
},
}, {
'url': 'http://static.pladform.ru/player.swf?pl=21469&videoid=100183293&vkcid=0',
'only_matching': True,
}, {
'url': 'http://video.pladform.ru/catalog/video/videoid/100183293/vkcid/0',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
qs = parse_qs(url)
pl = qs.get('pl', ['1'])[0]
video = self._download_xml(
'http://out.pladform.ru/getVideo', video_id, query={
'pl': pl,
'videoid': video_id,
}, fatal=False)
def fail(text):
raise ExtractorError(
f'{self.IE_NAME} returned error: {text}',
expected=True)
if not video:
target_url = self._request_webpage(url, video_id, note='Resolving final URL').url
if target_url == url:
raise ExtractorError('Can\'t parse page')
return self.url_result(target_url)
if video.tag == 'error':
fail(video.text)
quality = qualities(('ld', 'sd', 'hd'))
formats = []
for src in video.findall('./src'):
if src is None:
continue
format_url = src.text
if not format_url:
continue
if src.get('type') == 'hls' or determine_ext(format_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
else:
formats.append({
'url': src.text,
'format_id': src.get('quality'),
'quality': quality(src.get('quality')),
})
if not formats:
error = xpath_text(video, './cap', 'error', default=None)
if error:
fail(error)
webpage = self._download_webpage(
f'http://video.pladform.ru/catalog/video/videoid/{video_id}',
video_id)
title = self._og_search_title(webpage, fatal=False) or xpath_text(
video, './/title', 'title', fatal=True)
description = self._search_regex(
r'</h3>\s*<p>([^<]+)</p>', webpage, 'description', fatal=False)
thumbnail = self._og_search_thumbnail(webpage) or xpath_text(
video, './/cover', 'cover')
duration = int_or_none(xpath_text(video, './/time', 'duration'))
age_limit = int_or_none(xpath_text(video, './/age18', 'age limit'))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'age_limit': age_limit,
'formats': formats,
}

View File

@@ -19,6 +19,7 @@ class PlaywireIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.png$',
'duration': 145.94,
},
'skip': 'Invalid URL',
}, {
# m3u8 in f4m
'url': 'http://config.playwire.com/21772/videos/v2/4840492/zeus.json',
@@ -27,10 +28,7 @@ class PlaywireIE(InfoExtractor):
'ext': 'mp4',
'title': 'ITV EL SHOW FULL',
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'Invalid URL',
}, {
# Multiple resolutions while bitrates missing
'url': 'http://cdn.playwire.com/11625/embed/85228.html',
@@ -42,6 +40,15 @@ class PlaywireIE(InfoExtractor):
'url': 'http://cdn.playwire.com/v2/12342/config/1532636.json',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.cinemablend.com/new/First-Joe-Dirt-2-Trailer-Teaser-Stupid-Greatness-70874.html',
'info_dict': {
'id': '3519514',
'ext': 'mp4',
'title': 'Joe Dirt 2 Beautiful Loser Teaser Trailer',
},
'skip': 'Site no longer embeds Playwire',
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)

104
yt_dlp/extractor/plyr.py Normal file
View File

@@ -0,0 +1,104 @@
import re
from .common import InfoExtractor
from .vimeo import VimeoIE
class PlyrEmbedIE(InfoExtractor):
_VALID_URL = False
_WEBPAGE_TESTS = [{
# data-plyr-embed-id="https://player.vimeo.com/video/522319456/90e5c96063?dnt=1"
'url': 'https://www.dhm.de/zeughauskino/filmreihen/online-filmreihen/filme-des-marshall-plans/200000000-mouths/',
'info_dict': {
'id': '522319456',
'ext': 'mp4',
'title': '200.000.000 Mouths (195051)',
'uploader': 'Zeughauskino',
'uploader_url': '',
'comment_count': int,
'like_count': int,
'duration': 963,
'thumbnail': 'https://i.vimeocdn.com/video/1081797161-9f09ddb4b7faa86e834e006b8e4b9c2cbaa0baa7da493211bf0796ae133a5ab8-d',
'timestamp': 1615467405,
'upload_date': '20210311',
'release_timestamp': 1615467405,
'release_date': '20210311',
},
'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to parse XML: not well-formed'],
}, {
# data-plyr-provider="vimeo" data-plyr-embed-id="803435276"
'url': 'https://www.inarcassa.it/',
'info_dict': {
'id': '803435276',
'ext': 'mp4',
'title': 'HOME_Moto_Perpetuo',
'uploader': 'Inarcassa',
'uploader_url': '',
'duration': 38,
'thumbnail': 'https://i.vimeocdn.com/video/1663734769-945ad7ffabb16dbca009c023fd1d7b36bdb426a3dbae8345ed758136fe28f89a-d',
},
'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to parse XML: not well-formed'],
}, {
# data-plyr-embed-id="https://youtu.be/GF-BjYKoAqI"
'url': 'https://www.profile.nl',
'info_dict': {
'id': 'GF-BjYKoAqI',
'ext': 'mp4',
'title': 'PROFILE: Recruitment Profile',
'description': '',
'media_type': 'video',
'uploader': 'Profile Nederland',
'uploader_id': '@profilenederland',
'uploader_url': 'https://www.youtube.com/@profilenederland',
'channel': 'Profile Nederland',
'channel_id': 'UC9AUkB0Tv39-TBYjs05n3vg',
'channel_url': 'https://www.youtube.com/channel/UC9AUkB0Tv39-TBYjs05n3vg',
'channel_follower_count': int,
'view_count': int,
'like_count': int,
'age_limit': 0,
'duration': 39,
'thumbnail': 'https://i.ytimg.com/vi/GF-BjYKoAqI/maxresdefault.jpg',
'categories': ['Autos & Vehicles'],
'tags': [],
'timestamp': 1675692990,
'upload_date': '20230206',
'playable_in_embed': True,
'availability': 'public',
'live_status': 'not_live',
},
}, {
# data-plyr-embed-id="B1TZV8rNZoc" data-plyr-provider="youtube"
'url': 'https://www.vnis.edu.vn',
'info_dict': {
'id': 'vnis.edu',
'title': 'VNIS Education - Master Agent các Trường hàng đầu Bắc Mỹ',
'description': 'md5:4dafcf7335bb018780e4426da8ab8e4e',
'age_limit': 0,
'thumbnail': 'https://vnis.edu.vn/wp-content/uploads/2021/05/ve-welcome-en.png',
'timestamp': 1753233356,
'upload_date': '20250723',
},
'playlist_count': 3,
}]
@classmethod
def _extract_embed_urls(cls, url, webpage):
plyr_embeds = re.finditer(r'''(?x)
<div[^>]+(?:
data-plyr-embed-id="(?P<id1>[^"]+)"[^>]+data-plyr-provider="(?P<provider1>[^"]+)"|
data-plyr-provider="(?P<provider2>[^"]+)"[^>]+data-plyr-embed-id="(?P<id2>[^"]+)"
)[^>]*>''', webpage)
for mobj in plyr_embeds:
embed_id = mobj.group('id1') or mobj.group('id2')
provider = mobj.group('provider1') or mobj.group('provider2')
if provider == 'vimeo':
if not re.match(r'https?://', embed_id):
embed_id = f'https://player.vimeo.com/video/{embed_id}'
yield VimeoIE._smuggle_referrer(embed_id, url)
elif provider == 'youtube':
if not re.match(r'https?://', embed_id):
embed_id = f'https://youtube.com/watch?v={embed_id}'
yield embed_id

View File

@@ -6,6 +6,7 @@ from ..utils import (
int_or_none,
parse_resolution,
str_or_none,
traverse_obj,
try_get,
unified_timestamp,
url_or_none,
@@ -18,20 +19,21 @@ class PuhuTVIE(InfoExtractor):
IE_NAME = 'puhutv'
_TESTS = [{
# film
'url': 'https://puhutv.com/sut-kardesler-izle',
'md5': 'a347470371d56e1585d1b2c8dab01c96',
'url': 'https://puhutv.com/bi-kucuk-eylul-meselesi-izle',
'md5': '4de98170ccb84c05779b1f046b3c86f8',
'info_dict': {
'id': '5085',
'display_id': 'sut-kardesler',
'id': '11909',
'display_id': 'bi-kucuk-eylul-meselesi',
'ext': 'mp4',
'title': 'Süt Kardeşler',
'description': 'md5:ca09da25b7e57cbb5a9280d6e48d17aa',
'title': 'Bi Küçük Eylül Meselesi',
'description': 'md5:c2ab964c6542b7b26acacca0773adce2',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 4832.44,
'creator': 'Arzu Film',
'timestamp': 1561062602,
'duration': 6176.96,
'creator': 'Ay Yapım',
'creators': ['Ay Yapım'],
'timestamp': 1561062749,
'upload_date': '20190620',
'release_year': 1976,
'release_year': 2014,
'view_count': int,
'tags': list,
},
@@ -181,7 +183,7 @@ class PuhuTVSerieIE(InfoExtractor):
'playlist_mincount': 205,
}, {
# a film detail page which is using same url with serie page
'url': 'https://puhutv.com/kaybedenler-kulubu-detay',
'url': 'https://puhutv.com/bizim-icin-sampiyon-detay',
'only_matching': True,
}]
@@ -194,24 +196,19 @@ class PuhuTVSerieIE(InfoExtractor):
has_more = True
while has_more is True:
season = self._download_json(
f'https://galadriel.puhutv.com/seasons/{season_id}',
f'https://appservice.puhutv.com/api/seasons/{season_id}/episodes?v=2',
season_id, f'Downloading page {page}', query={
'page': page,
'per': 40,
})
episodes = season.get('episodes')
if isinstance(episodes, list):
for ep in episodes:
slug_path = str_or_none(ep.get('slugPath'))
if not slug_path:
continue
video_id = str_or_none(int_or_none(ep.get('id')))
yield self.url_result(
f'https://puhutv.com/{slug_path}',
ie=PuhuTVIE.ie_key(), video_id=video_id,
video_title=ep.get('name') or ep.get('eventLabel'))
'per': 100,
})['data']
for episode in traverse_obj(season, ('episodes', lambda _, v: v.get('slug') or v['assets'][0]['slug'])):
slug = episode.get('slug') or episode['assets'][0]['slug']
yield self.url_result(
f'https://puhutv.com/{slug}', PuhuTVIE, episode.get('id'), episode.get('name'))
page += 1
has_more = season.get('hasMore')
has_more = traverse_obj(season, 'has_more')
def _real_extract(self, url):
playlist_id = self._match_id(url)
@@ -226,7 +223,10 @@ class PuhuTVSerieIE(InfoExtractor):
self._extract_entries(seasons), playlist_id, info.get('name'))
# For films, these are using same url with series
video_id = info.get('slug') or info['assets'][0]['slug']
video_id = info.get('slug')
if video_id:
video_id = video_id.removesuffix('-detay')
else:
video_id = info['assets'][0]['slug'].removesuffix('-izle')
return self.url_result(
f'https://puhutv.com/{video_id}-izle',
PuhuTVIE.ie_key(), video_id)
f'https://puhutv.com/{video_id}-izle', PuhuTVIE, video_id)

View File

@@ -3,9 +3,9 @@ from ..utils.traversal import traverse_obj
class RoyaLiveIE(InfoExtractor):
_VALID_URL = r'https?://roya\.tv/live-stream/(?P<id>\d+)'
_VALID_URL = r'https?://(?:en\.)?roya\.tv/live-stream/(?P<id>\d+)'
_TESTS = [{
'url': 'https://roya.tv/live-stream/1',
'url': 'https://en.roya.tv/live-stream/1',
'info_dict': {
'id': '1',
'title': r're:Roya TV \d{4}-\d{2}-\d{2} \d{2}:\d{2}',

View File

@@ -6,9 +6,11 @@ import urllib.parse
from .common import InfoExtractor
from ..utils import (
ExtractorError,
InAdvancePagedList,
clean_html,
determine_ext,
float_or_none,
int_or_none,
make_archive_id,
parse_iso8601,
qualities,
@@ -371,3 +373,62 @@ class RTVETelevisionIE(InfoExtractor):
raise ExtractorError('The webpage doesn\'t contain any video', expected=True)
return self.url_result(play_url, ie=RTVEALaCartaIE.ie_key())
class RTVEProgramIE(RTVEBaseIE):
IE_NAME = 'rtve.es:program'
IE_DESC = 'RTVE.es programs'
_VALID_URL = r'https?://(?:www\.)?rtve\.es/play/videos/(?P<id>[\w-]+)/?(?:[?#]|$)'
_TESTS = [{
'url': 'https://www.rtve.es/play/videos/saber-vivir/',
'info_dict': {
'id': '111570',
'title': 'Saber vivir - Programa de ciencia y futuro en RTVE Play',
},
'playlist_mincount': 400,
}]
_PAGE_SIZE = 60
def _fetch_page(self, program_id, page_num):
return self._download_json(
f'https://www.rtve.es/api/programas/{program_id}/videos',
program_id, note=f'Downloading page {page_num}',
query={
'type': 39816,
'page': page_num,
'size': 60,
})
def _entries(self, page_data):
for video in traverse_obj(page_data, ('page', 'items', lambda _, v: url_or_none(v['htmlUrl']))):
yield self.url_result(
video['htmlUrl'], RTVEALaCartaIE, url_transparent=True,
**traverse_obj(video, {
'id': ('id', {str}),
'title': ('longTitle', {str}),
'description': ('shortDescription', {str}),
'duration': ('duration', {float_or_none(scale=1000)}),
'series': (('programInfo', 'title'), {str}, any),
'season_number': ('temporadaOrden', {int_or_none}),
'season_id': ('temporadaId', {str}),
'season': ('temporada', {str}),
'episode_number': ('episode', {int_or_none}),
'episode': ('title', {str}),
'thumbnail': ('thumbnail', {url_or_none}),
}),
)
def _real_extract(self, url):
program_slug = self._match_id(url)
program_page = self._download_webpage(url, program_slug)
program_id = self._html_search_meta('DC.identifier', program_page, 'Program ID', fatal=True)
first_page = self._fetch_page(program_id, 1)
page_count = traverse_obj(first_page, ('page', 'totalPages', {int})) or 1
entries = InAdvancePagedList(
lambda idx: self._entries(self._fetch_page(program_id, idx + 1) if idx else first_page),
page_count, self._PAGE_SIZE)
return self.playlist_result(entries, program_id, self._html_extract_title(program_page))

View File

@@ -115,7 +115,6 @@ class RutubeIE(RutubeBaseIE):
_TESTS = [{
'url': 'https://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/',
'md5': '3d73fdfe5bb81b9aef139e22ef3de26a',
'info_dict': {
'id': '3eac3b4561676c17df9132a9a1e62e3e',
'ext': 'mp4',
@@ -128,10 +127,11 @@ class RutubeIE(RutubeBaseIE):
'upload_date': '20131016',
'age_limit': 0,
'view_count': int,
'thumbnail': 'https://pic.rutubelist.ru/video/d2/a0/d2a0aec998494a396deafc7ba2c82add.jpg',
'thumbnail': r're:https?://pic\.rutubelist\.ru/video/.+\.(?:jpg|png)',
'categories': ['Новости и СМИ'],
'chapters': [],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://rutube.ru/play/embed/a10e53b86e8f349080f718582ce4c661',
'only_matching': True,
@@ -146,7 +146,6 @@ class RutubeIE(RutubeBaseIE):
'only_matching': True,
}, {
'url': 'https://rutube.ru/video/private/884fb55f07a97ab673c7d654553e0f48/?p=x2QojCumHTS3rsKHWXN8Lg',
'md5': '4fce7b4fcc7b1bcaa3f45eb1e1ad0dd7',
'info_dict': {
'id': '884fb55f07a97ab673c7d654553e0f48',
'ext': 'mp4',
@@ -163,6 +162,7 @@ class RutubeIE(RutubeBaseIE):
'categories': ['Видеоигры'],
'chapters': [],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://rutube.ru/video/c65b465ad0c98c89f3b25cb03dcc87c6/',
'info_dict': {
@@ -171,7 +171,7 @@ class RutubeIE(RutubeBaseIE):
'chapters': 'count:4',
'categories': ['Бизнес и предпринимательство'],
'description': 'md5:252feac1305257d8c1bab215cedde75d',
'thumbnail': 'https://pic.rutubelist.ru/video/71/8f/718f27425ea9706073eb80883dd3787b.png',
'thumbnail': r're:https?://pic\.rutubelist\.ru/video/.+\.(?:jpg|png)',
'duration': 782,
'age_limit': 0,
'uploader_id': '23491359',
@@ -181,6 +181,7 @@ class RutubeIE(RutubeBaseIE):
'title': 'Бизнес с нуля: найм сотрудников. Интервью с директором строительной компании #1',
'uploader': 'Стас Быков',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://rutube.ru/live/video/c58f502c7bb34a8fcdd976b221fca292/',
'info_dict': {
@@ -188,16 +189,17 @@ class RutubeIE(RutubeBaseIE):
'ext': 'mp4',
'categories': ['Телепередачи'],
'description': '',
'thumbnail': 'https://pic.rutubelist.ru/video/14/19/14190807c0c48b40361aca93ad0867c7.jpg',
'thumbnail': r're:https?://pic\.rutubelist\.ru/video/.+\.(?:jpg|png)',
'live_status': 'is_live',
'age_limit': 0,
'uploader_id': '23460655',
'timestamp': 1652972968,
'view_count': int,
'upload_date': '20220519',
'title': r're:Первый канал. Прямой эфир \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
'title': str,
'uploader': 'Первый канал',
},
'skip': 'Invalid URL',
}, {
'url': 'https://rutube.ru/play/embed/03a9cb54bac3376af4c5cb0f18444e01/',
'info_dict': {
@@ -211,11 +213,12 @@ class RutubeIE(RutubeBaseIE):
'duration': 293,
'uploader': 'MOEX - Московская биржа',
'timestamp': 1724946628,
'thumbnail': 'https://pic.rutubelist.ru/video/2e/24/2e241fddb459baf0fa54acfca44874f4.jpg',
'thumbnail': r're:https?://pic\.rutubelist\.ru/video/.+\.(?:jpg|png)',
'view_count': int,
'uploader_id': '38420507',
'categories': ['Интервью'],
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://rutube.ru/video/5ab908fccfac5bb43ef2b1e4182256b0/',
'only_matching': True,
@@ -223,6 +226,26 @@ class RutubeIE(RutubeBaseIE):
'url': 'https://rutube.ru/live/video/private/c58f502c7bb34a8fcdd976b221fca292/',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://novate.ru/blogs/170625/73644/',
'info_dict': {
'id': 'b0c96c75a4e5b274721bbced6ed8fb64',
'ext': 'mp4',
'title': 'Где в России находится единственная в своем роде скальная торпедная батарея',
'age_limit': 0,
'categories': ['Наука'],
'chapters': [],
'description': 'md5:2ed82e6b81958a43da6fb4d56f949e1f',
'duration': 182,
'thumbnail': r're:https?://pic\.rutubelist\.ru/video/.+\.(?:jpg|png)',
'timestamp': 1749950158,
'upload_date': '20250615',
'uploader': 'Novate',
'uploader_id': '24044809',
'view_count': int,
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -256,12 +279,10 @@ class RutubeEmbedIE(RutubeBaseIE):
'chapters': [],
'description': 'md5:a5acea57bbc3ccdc3cacd1f11a014b5b',
'view_count': int,
'thumbnail': 'https://pic.rutubelist.ru/video/d3/03/d3031f4670a6e6170d88fb3607948418.jpg',
'thumbnail': r're:https?://pic\.rutubelist\.ru/video/.+\.(?:jpg|png)',
'categories': ['Сериалы'],
},
'params': {
'skip_download': True,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://rutube.ru/play/embed/8083783',
'only_matching': True,

View File

@@ -16,96 +16,88 @@ class RUTVIE(InfoExtractor):
)
(?P<id>\d+)
'''
_EMBED_URLS = [
_EMBED_REGEX = [
r'<iframe[^>]+?src=(["\'])(?P<url>https?://(?:test)?player\.(?:rutv\.ru|vgtrk\.com)/(?:iframe/(?:swf|video|live)/id|index/iframe/cast_id)/.+?)\1',
r'<meta[^>]+?property=(["\'])og:video\1[^>]+?content=(["\'])(?P<url>https?://(?:test)?player\.(?:rutv\.ru|vgtrk\.com)/flash\d+v/container\.swf\?id=.+?\2)',
]
_TESTS = [
{
'url': 'http://player.rutv.ru/flash2v/container.swf?id=774471&sid=kultura&fbv=true&isPlay=true&ssl=false&i=560&acc_video_id=episode_id/972347/video_id/978186/brand_id/31724',
'info_dict': {
'id': '774471',
'ext': 'mp4',
'title': 'Монологи на все времена',
'description': 'md5:18d8b5e6a41fb1faa53819471852d5d5',
'duration': 2906,
},
'params': {
# m3u8 download
'skip_download': True,
},
_TESTS = [{
'url': 'http://player.rutv.ru/flash2v/container.swf?id=774471&sid=kultura&fbv=true&isPlay=true&ssl=false&i=560&acc_video_id=episode_id/972347/video_id/978186/brand_id/31724',
'info_dict': {
'id': '774471',
'ext': 'mp4',
'title': 'Монологи на все времена. Концерт',
'description': 'md5:18d8b5e6a41fb1faa53819471852d5d5',
'duration': 2906,
'thumbnail': r're:https?://cdn-st2\.smotrim\.ru/.+\.jpg',
},
{
'url': 'https://player.vgtrk.com/flash2v/container.swf?id=774016&sid=russiatv&fbv=true&isPlay=true&ssl=false&i=560&acc_video_id=episode_id/972098/video_id/977760/brand_id/57638',
'info_dict': {
'id': '774016',
'ext': 'mp4',
'title': 'Чужой в семье Сталина',
'description': '',
'duration': 2539,
},
'params': {
# m3u8 download
'skip_download': True,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://player.vgtrk.com/flash2v/container.swf?id=774016&sid=russiatv&fbv=true&isPlay=true&ssl=false&i=560&acc_video_id=episode_id/972098/video_id/977760/brand_id/57638',
'info_dict': {
'id': '774016',
'ext': 'mp4',
'title': 'Чужой в семье Сталина',
'description': '',
'duration': 2539,
},
{
'url': 'http://player.rutv.ru/iframe/swf/id/766888/sid/hitech/?acc_video_id=4000',
'info_dict': {
'id': '766888',
'ext': 'mp4',
'title': 'Вести.net: интернет-гиганты начали перетягивание программных "одеял"',
'description': 'md5:65ddd47f9830c4f42ed6475f8730c995',
'duration': 279,
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'Invalid URL',
}, {
'url': 'http://player.rutv.ru/iframe/swf/id/766888/sid/hitech/?acc_video_id=4000',
'info_dict': {
'id': '766888',
'ext': 'mp4',
'title': 'Вести.net: интернет-гиганты начали перетягивание программных "одеял"',
'description': 'md5:65ddd47f9830c4f42ed6475f8730c995',
'duration': 279,
'thumbnail': r're:https?://cdn-st2\.smotrim\.ru/.+\.jpg',
},
{
'url': 'http://player.rutv.ru/iframe/video/id/771852/start_zoom/true/showZoomBtn/false/sid/russiatv/?acc_video_id=episode_id/970443/video_id/975648/brand_id/5169',
'info_dict': {
'id': '771852',
'ext': 'mp4',
'title': 'Прямой эфир. Жертвы загадочной болезни: смерть от старости в 17 лет',
'description': 'md5:b81c8c55247a4bd996b43ce17395b2d8',
'duration': 3096,
},
'params': {
# m3u8 download
'skip_download': True,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://player.rutv.ru/iframe/video/id/771852/start_zoom/true/showZoomBtn/false/sid/russiatv/?acc_video_id=episode_id/970443/video_id/975648/brand_id/5169',
'info_dict': {
'id': '771852',
'ext': 'mp4',
'title': 'Прямой эфир. Жертвы загадочной болезни: смерть от старости в 17 лет',
'description': 'md5:b81c8c55247a4bd996b43ce17395b2d8',
'duration': 3096,
'thumbnail': r're:https?://cdn-st2\.smotrim\.ru/.+\.jpg',
},
{
'url': 'http://player.rutv.ru/iframe/live/id/51499/showZoomBtn/false/isPlay/true/sid/sochi2014',
'info_dict': {
'id': '51499',
'ext': 'flv',
'title': 'Сочи-2014. Биатлон. Индивидуальная гонка. Мужчины ',
'description': 'md5:9e0ed5c9d2fa1efbfdfed90c9a6d179c',
},
'skip': 'Translation has finished',
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://player.rutv.ru/iframe/live/id/51499/showZoomBtn/false/isPlay/true/sid/sochi2014',
'info_dict': {
'id': '51499',
'ext': 'flv',
'title': 'Сочи-2014. Биатлон. Индивидуальная гонка. Мужчины ',
'description': 'md5:9e0ed5c9d2fa1efbfdfed90c9a6d179c',
},
{
'url': 'http://player.rutv.ru/iframe/live/id/21/showZoomBtn/false/isPlay/true/',
'info_dict': {
'id': '21',
'ext': 'mp4',
'title': 're:^Россия 24. Прямой эфир [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'is_live': True,
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'Invalid URL',
}, {
'url': 'http://player.rutv.ru/iframe/live/id/21/showZoomBtn/false/isPlay/true/',
'info_dict': {
'id': '21',
'ext': 'mp4',
'title': str,
'is_live': True,
},
{
'url': 'https://testplayer.vgtrk.com/iframe/live/id/19201/showZoomBtn/false/isPlay/true/',
'only_matching': True,
'skip': 'Invalid URL',
}, {
'url': 'https://testplayer.vgtrk.com/iframe/live/id/19201/showZoomBtn/false/isPlay/true/',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'http://istoriya-teatra.ru/news/item/f00/s05/n0000545/index.shtml',
'info_dict': {
'id': '1952012',
'ext': 'mp4',
'title': 'Новости культуры. Эфир от 10.10.2019 (23:30). Театр Сатиры отмечает день рождения премьерой',
'description': 'md5:fced27112ff01ff8fc4a452fc088bad6',
'duration': 191,
'thumbnail': r're:https?://cdn-st2\.smotrim\.ru/.+\.jpg',
},
]
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)

View File

@@ -18,6 +18,7 @@ from ..utils import (
class RuutuIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'''(?x)
https?://
(?:
@@ -26,112 +27,111 @@ class RuutuIE(InfoExtractor):
)
(?P<id>\d+)
'''
_TESTS = [
{
'url': 'http://www.ruutu.fi/video/2058907',
'md5': 'ab2093f39be1ca8581963451b3c0234f',
'info_dict': {
'id': '2058907',
'ext': 'mp4',
'title': 'Oletko aina halunnut tietää mitä tapahtuu vain hetki ennen lähetystä? - Nyt se selvisi!',
'description': 'md5:cfc6ccf0e57a814360df464a91ff67d6',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 114,
'age_limit': 0,
'upload_date': '20150508',
},
_TESTS = [{
'url': 'http://www.ruutu.fi/video/2058907',
'md5': 'ab2093f39be1ca8581963451b3c0234f',
'info_dict': {
'id': '2058907',
'ext': 'mp4',
'title': 'Oletko aina halunnut tietää mitä tapahtuu vain hetki ennen lähetystä? - Nyt se selvisi!',
'description': 'md5:cfc6ccf0e57a814360df464a91ff67d6',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 114,
'age_limit': 0,
'upload_date': '20150508',
},
{
'url': 'http://www.ruutu.fi/video/2057306',
'md5': '065a10ae4d5b8cfd9d0c3d332465e3d9',
'info_dict': {
'id': '2057306',
'ext': 'mp4',
'title': 'Superpesis: katso koko kausi Ruudussa',
'description': 'md5:bfb7336df2a12dc21d18fa696c9f8f23',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 40,
'age_limit': 0,
'upload_date': '20150507',
'series': 'Superpesis',
'categories': ['Urheilu'],
},
}, {
'url': 'http://www.ruutu.fi/video/2057306',
'md5': '065a10ae4d5b8cfd9d0c3d332465e3d9',
'info_dict': {
'id': '2057306',
'ext': 'mp4',
'title': 'Superpesis: katso koko kausi Ruudussa',
'description': 'md5:bfb7336df2a12dc21d18fa696c9f8f23',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 40,
'age_limit': 0,
'upload_date': '20150507',
'series': 'Superpesis',
'categories': ['Urheilu'],
},
{
'url': 'http://www.supla.fi/supla/2231370',
'md5': 'df14e782d49a2c0df03d3be2a54ef949',
'info_dict': {
'id': '2231370',
'ext': 'mp4',
'title': 'Osa 1: Mikael Jungner',
'description': 'md5:7d90f358c47542e3072ff65d7b1bcffe',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 0,
'upload_date': '20151012',
'series': 'Läpivalaisu',
},
}, {
'url': 'http://www.supla.fi/supla/2231370',
'md5': 'df14e782d49a2c0df03d3be2a54ef949',
'info_dict': {
'id': '2231370',
'ext': 'mp4',
'title': 'Osa 1: Mikael Jungner',
'description': 'md5:7d90f358c47542e3072ff65d7b1bcffe',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 0,
'upload_date': '20151012',
'series': 'Läpivalaisu',
},
}, {
# Episode where <SourceFile> is "NOT-USED", but has other
# downloadable sources available.
{
'url': 'http://www.ruutu.fi/video/3193728',
'only_matching': True,
'url': 'http://www.ruutu.fi/video/3193728',
'only_matching': True,
}, {
# audio podcast
'url': 'https://www.supla.fi/supla/3382410',
'md5': 'b9d7155fed37b2ebf6021d74c4b8e908',
'info_dict': {
'id': '3382410',
'ext': 'mp3',
'title': 'Mikä ihmeen poltergeist?',
'description': 'md5:bbb6963df17dfd0ecd9eb9a61bf14b52',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 0,
'upload_date': '20190320',
'series': 'Mysteeritarinat',
'duration': 1324,
},
{
# audio podcast
'url': 'https://www.supla.fi/supla/3382410',
'md5': 'b9d7155fed37b2ebf6021d74c4b8e908',
'info_dict': {
'id': '3382410',
'ext': 'mp3',
'title': 'Mikä ihmeen poltergeist?',
'description': 'md5:bbb6963df17dfd0ecd9eb9a61bf14b52',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 0,
'upload_date': '20190320',
'series': 'Mysteeritarinat',
'duration': 1324,
},
'expected_warnings': [
'HTTP Error 502: Bad Gateway',
'Failed to download m3u8 information',
],
'expected_warnings': [
'HTTP Error 502: Bad Gateway',
'Failed to download m3u8 information',
],
}, {
'url': 'http://www.supla.fi/audio/2231370',
'only_matching': True,
}, {
'url': 'https://static.nelonenmedia.fi/player/misc/embed_player.html?nid=3618790',
'only_matching': True,
}, {
# episode
'url': 'https://www.ruutu.fi/video/3401964',
'info_dict': {
'id': '3401964',
'ext': 'mp4',
'title': 'Temptation Island Suomi - Kausi 5 - Jakso 17',
'description': 'md5:87cf01d5e1e88adf0c8a2937d2bd42ba',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 2582,
'age_limit': 12,
'upload_date': '20190508',
'series': 'Temptation Island Suomi',
'season_number': 5,
'episode_number': 17,
'categories': ['Reality ja tositapahtumat', 'Kotimaiset suosikit', 'Romantiikka ja parisuhde'],
},
{
'url': 'http://www.supla.fi/audio/2231370',
'only_matching': True,
'params': {
'skip_download': True,
},
{
'url': 'https://static.nelonenmedia.fi/player/misc/embed_player.html?nid=3618790',
'only_matching': True,
}, {
# premium
'url': 'https://www.ruutu.fi/video/3618715',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# FIXME: Broken IE
'url': 'https://www.hs.fi/maailma/art-2000011353059.html',
'info_dict': {
'id': '4746675',
'ext': 'mp4',
'title': 'Yhdysvaltojen Texasin osavaltiota ovat koetelleet tuhoisat tulvat',
},
{
# episode
'url': 'https://www.ruutu.fi/video/3401964',
'info_dict': {
'id': '3401964',
'ext': 'mp4',
'title': 'Temptation Island Suomi - Kausi 5 - Jakso 17',
'description': 'md5:87cf01d5e1e88adf0c8a2937d2bd42ba',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 2582,
'age_limit': 12,
'upload_date': '20190508',
'series': 'Temptation Island Suomi',
'season_number': 5,
'episode_number': 17,
'categories': ['Reality ja tositapahtumat', 'Kotimaiset suosikit', 'Romantiikka ja parisuhde'],
},
'params': {
'skip_download': True,
},
},
{
# premium
'url': 'https://www.ruutu.fi/video/3618715',
'only_matching': True,
},
]
}]
_API_BASE = 'https://gatling.nelonenmedia.fi'
@classmethod

View File

@@ -23,13 +23,10 @@ class SenateISVPIE(InfoExtractor):
'id': 'judiciary031715',
'ext': 'mp4',
'title': 'ISVP',
'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
'thumbnail': r're:https?://.+\.(?:jpe?g|png)',
'_old_archive_ids': ['senategov judiciary031715'],
},
'params': {
# m3u8 download
'skip_download': True,
},
'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to download m3u8 information'],
}, {
'url': 'http://www.senate.gov/isvp/?type=live&comm=commerce&filename=commerce011514.mp4&auto_play=false',
@@ -39,10 +36,6 @@ class SenateISVPIE(InfoExtractor):
'title': 'Integrated Senate Video Player',
'_old_archive_ids': ['senategov commerce011514'],
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'This video is not available.',
}, {
'url': 'http://www.senate.gov/isvp/?type=arch&comm=intel&filename=intel090613&hc_location=ufi',
@@ -60,7 +53,7 @@ class SenateISVPIE(InfoExtractor):
'id': 'help090920',
'ext': 'mp4',
'title': 'ISVP',
'thumbnail': 'https://www.help.senate.gov/assets/images/video-poster.png',
'thumbnail': r're:https?://.+\.(?:jpe?g|png)',
'_old_archive_ids': ['senategov help090920'],
},
}, {
@@ -68,6 +61,17 @@ class SenateISVPIE(InfoExtractor):
'url': 'http://www.senate.gov/isvp?type=live&comm=banking&filename=banking012715',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# FIXME: Embed detection
'url': 'https://www.hsgac.senate.gov/subcommittees/bmfwra/hearings/match-ready-oversight-of-the-federal-governments-border-management-and-personnel-readiness-efforts-for-the-decade-of-sports/',
'info_dict': {
'id': 'govtaff061025',
'ext': 'mp4',
'title': 'ISVP',
'thumbnail': r're:https?://.+\.(?:jpe?g|png)',
'_old_archive_ids': ['senategov govtaff061025'],
},
}]
_COMMITTEES = {
'ag': ('76440', 'https://ag-f.akamaihd.net', '2036803', 'agriculture'),
@@ -150,10 +154,10 @@ class SenateGovIE(InfoExtractor):
'id': 'help090920',
'display_id': 'vaccines-saving-lives-ensuring-confidence-and-protecting-public-health',
'title': 'Vaccines: Saving Lives, Ensuring Confidence, and Protecting Public Health',
'description': 'The U.S. Senate Committee on Health, Education, Labor & Pensions',
'description': 'Full Committee Hearing on September 9, 2020 at 6:00 AM',
'ext': 'mp4',
'age_limit': 0,
'thumbnail': 'https://www.help.senate.gov/assets/images/sharelogo.jpg',
'thumbnail': r're:https?://.+\.(?:jpe?g|png)',
'_old_archive_ids': ['senategov help090920'],
},
'params': {'skip_download': 'm3u8'},
@@ -165,7 +169,7 @@ class SenateGovIE(InfoExtractor):
'title': 'Review of the FY2019 Budget Request for the U.S. Army',
'ext': 'mp4',
'age_limit': 0,
'thumbnail': 'https://www.appropriations.senate.gov/themes/appropriations/images/video-poster-flash-fit.png',
'thumbnail': r're:https?://.+\.(?:jpe?g|png)',
'_old_archive_ids': ['senategov appropsA051518'],
},
'params': {'skip_download': 'm3u8'},
@@ -178,7 +182,7 @@ class SenateGovIE(InfoExtractor):
'title': '21st Century Communities: Public Transportation Infrastructure Investment and FAST Act Reauthorization',
'description': 'The Official website of The United States Committee on Banking, Housing, and Urban Affairs',
'ext': 'mp4',
'thumbnail': 'https://www.banking.senate.gov/themes/banking/images/sharelogo.jpg',
'thumbnail': r're:https?://.+\.(?:jpe?g|png)',
'age_limit': 0,
'_old_archive_ids': ['senategov banking041521'],
},

34
yt_dlp/extractor/shiey.py Normal file
View File

@@ -0,0 +1,34 @@
import json
from .common import InfoExtractor
from .vimeo import VimeoIE
from ..utils import extract_attributes
from ..utils.traversal import find_element, traverse_obj
class ShieyIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?shiey\.com/videos/v/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.shiey.com/videos/v/train-journey-to-edge-of-serbia-ep-2',
'info_dict': {
'id': '1103409448',
'ext': 'mp4',
'title': 'Train Journey To Edge of Serbia (Ep. 2)',
'uploader': 'shiey',
'uploader_url': '',
'duration': 1364,
'thumbnail': r're:^https?://.+',
},
'params': {'skip_download': True},
'expected_warnings': ['Failed to parse XML: not well-formed'],
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
oembed_html = traverse_obj(webpage, (
{find_element(attr='data-controller', value='VideoEmbed', html=True)},
{extract_attributes}, 'data-config-embed-video', {json.loads}, 'oembedHtml', {str}))
return self.url_result(VimeoIE._extract_url(url, oembed_html), VimeoIE)

View File

@@ -76,17 +76,18 @@ class SimplecastIE(SimplecastBaseIE):
'id': 'b6dc49a2-9404-4853-9aa9-9cfc097be876',
'ext': 'mp3',
'title': 'Errant Signal - Chris Franklin & New Wave Video Essays',
'channel_url': 'https://the-re-bind-io-podcast.simplecast.com',
'episode': 'Episode 1',
'episode_number': 1,
'episode_id': 'b6dc49a2-9404-4853-9aa9-9cfc097be876',
'description': 'md5:34752789d3d2702e2d2c975fbd14f357',
'season': 'Season 1',
'season_number': 1,
'season_id': 'e23df0da-bae4-4531-8bbf-71364a88dc13',
'series': 'The RE:BIND.io Podcast',
'duration': 5343,
'timestamp': 1580979475,
'upload_date': '20200206',
'webpage_url': r're:^https?://the-re-bind-io-podcast\.simplecast\.com/episodes/errant-signal-chris-franklin-new-wave-video-essays',
'channel_url': r're:^https?://the-re-bind-io-podcast\.simplecast\.com$',
}
_TESTS = [{
'url': 'https://api.simplecast.com/episodes/b6dc49a2-9404-4853-9aa9-9cfc097be876',
@@ -96,6 +97,29 @@ class SimplecastIE(SimplecastBaseIE):
'url': 'https://player.simplecast.com/b6dc49a2-9404-4853-9aa9-9cfc097be876',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
# FIXME: Embed detection
'url': 'https://poddtoppen.se/podcast/1498417306/the-rebindio-podcast/errant-signal-chris-franklin-new-wave-video-essays',
'md5': '8c93be7be54251bf29ee97464eabe61c',
'info_dict': {
'id': 'b6dc49a2-9404-4853-9aa9-9cfc097be876',
'ext': 'mp3',
'title': 'Errant Signal - Chris Franklin & New Wave Video Essays',
'channel_url': 'https://the-re-bind-io-podcast.simplecast.com',
'description': 'md5:34752789d3d2702e2d2c975fbd14f357',
'display_id': 'errant-signal-chris-franklin-new-wave-video-essays',
'duration': 5343,
'episode': 'Episode 1',
'episode_id': 'b6dc49a2-9404-4853-9aa9-9cfc097be876',
'episode_number': 1,
'season': 'Season 1',
'season_id': 'e23df0da-bae4-4531-8bbf-71364a88dc13',
'season_number': 1,
'series': 'The RE:BIND.io Podcast',
'timestamp': 1580979475,
'upload_date': '20200206',
},
}]
def _real_extract(self, url):
episode_id = self._match_id(url)
@@ -106,11 +130,11 @@ class SimplecastIE(SimplecastBaseIE):
class SimplecastEpisodeIE(SimplecastBaseIE):
IE_NAME = 'simplecast:episode'
_VALID_URL = r'https?://(?!api\.)[^/]+\.simplecast\.com/episodes/(?P<id>[^/?&#]+)'
_TEST = {
_TESTS = [{
'url': 'https://the-re-bind-io-podcast.simplecast.com/episodes/errant-signal-chris-franklin-new-wave-video-essays',
'md5': '8c93be7be54251bf29ee97464eabe61c',
'info_dict': SimplecastIE._COMMON_TEST_INFO,
}
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)
@@ -124,7 +148,7 @@ class SimplecastPodcastIE(SimplecastBaseIE):
_VALID_URL = r'https?://(?!(?:api|cdn|embed|feeds|player)\.)(?P<id>[^/]+)\.simplecast\.com(?!/episodes/[^/?&#]+)'
_TESTS = [{
'url': 'https://the-re-bind-io-podcast.simplecast.com',
'playlist_mincount': 33,
'playlist_mincount': 32,
'info_dict': {
'id': '07d28d26-7522-42eb-8c53-2bdcfc81c43c',
'title': 'The RE:BIND.io Podcast',

View File

@@ -26,11 +26,47 @@ from ..utils.traversal import traverse_obj
class SoundcloudEmbedIE(InfoExtractor):
_VALID_URL = r'https?://(?:w|player|p)\.soundcloud\.com/player/?.*?\burl=(?P<id>.+)'
_EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1']
_TEST = {
_TESTS = [{
# from https://www.soundi.fi/uutiset/ennakkokuuntelussa-timo-kaukolammen-station-to-station-to-station-julkaisua-juhlitaan-tanaan-g-livelabissa/
'url': 'https://w.soundcloud.com/player/?visual=true&url=https%3A%2F%2Fapi.soundcloud.com%2Fplaylists%2F922213810&show_artwork=true&maxwidth=640&maxheight=960&dnt=1&secret_token=s-ziYey',
'only_matching': True,
}
}]
_WEBPAGE_TESTS = [{
'url': 'https://news.sophos.com/en-us/2023/08/10/s3-ep147-what-if-you-type-in-your-password-during-a-meeting/',
'info_dict': {
'id': '1588847423',
'ext': 'm4a',
'title': 'S3 Ep147: What if you type in your password during a meeting?',
'artists': ['Naked Security'],
'description': 'md5:6931a0630b920413c8c904407bf4b3b2',
'duration': 942.762,
'genres': ['Technology'],
'license': 'all-rights-reserved',
'repost_count': int,
'tags': 'count:4',
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'timestamp': 1691624365,
'track': 'S3 Ep147: What if you type in your password during a meeting?',
'upload_date': '20230809',
'uploader': 'Naked Security',
'uploader_id': '61390843',
'uploader_url': 'https://soundcloud.com/sophossecurity',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.guitarplayer.com/lessons/november-2023-guitar-player-lesson-audio',
'info_dict': {
'id': '1695754080',
'title': 'A Tribute to Brian Setzers Guitar Mastery',
'album': 'A Tribute to Brian Setzers Guitar Mastery',
'album_artists': ['Guitar Player'],
'album_type': 'playlist',
'description': '',
'uploader': 'Guitar Player',
'uploader_id': '489924156',
},
'playlist_mincount': 7,
}]
def _real_extract(self, url):
query = parse_qs(url)
@@ -407,269 +443,256 @@ class SoundcloudIE(SoundcloudBaseIE):
)
'''
IE_NAME = 'soundcloud'
_TESTS = [
{
'url': 'http://soundcloud.com/ethmusic/lostin-powers-she-so-heavy',
'md5': 'de9bac153e7427a7333b4b0c1b6a18d2',
'info_dict': {
'id': '62986583',
'ext': 'opus',
'title': 'Lostin Powers - She so Heavy (SneakPreview) Adrian Ackers Blueprint 1',
'track': 'Lostin Powers - She so Heavy (SneakPreview) Adrian Ackers Blueprint 1',
'description': 'No Downloads untill we record the finished version this weekend, i was too pumped n i had to post it , earl is prolly gonna b hella p.o\'d',
'uploader': 'E.T. ExTerrestrial Music',
'uploader_id': '1571244',
'timestamp': 1349920598,
'upload_date': '20121011',
'duration': 143.216,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'thumbnail': 'https://i1.sndcdn.com/artworks-000031955188-rwb18x-original.jpg',
'uploader_url': 'https://soundcloud.com/ethmusic',
'tags': 'count:14',
},
_TESTS = [{
'url': 'http://soundcloud.com/ethmusic/lostin-powers-she-so-heavy',
'md5': 'de9bac153e7427a7333b4b0c1b6a18d2',
'info_dict': {
'id': '62986583',
'ext': 'opus',
'title': 'Lostin Powers - She so Heavy (SneakPreview) Adrian Ackers Blueprint 1',
'track': 'Lostin Powers - She so Heavy (SneakPreview) Adrian Ackers Blueprint 1',
'description': 'md5:7b6074e00887ad79f59b647c8fb6d5ae',
'uploader': 'E.T. ExTerrestrial Music',
'uploader_id': '1571244',
'timestamp': 1349920598,
'upload_date': '20121011',
'duration': 143.216,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'uploader_url': 'https://soundcloud.com/ethmusic',
'tags': 'count:14',
},
# geo-restricted
{
'url': 'https://soundcloud.com/the-concept-band/goldrushed-mastered?in=the-concept-band/sets/the-royal-concept-ep',
'info_dict': {
'id': '47127627',
'ext': 'opus',
'title': 'Goldrushed',
'track': 'Goldrushed',
'description': 'From Stockholm Sweden\r\nPovel / Magnus / Filip / David\r\nwww.theroyalconcept.com',
'uploader': 'The Royal Concept',
'uploader_id': '9615865',
'timestamp': 1337635207,
'upload_date': '20120521',
'duration': 227.103,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/the-concept-band',
'thumbnail': 'https://i1.sndcdn.com/artworks-v8bFHhXm7Au6-0-original.jpg',
'genres': ['Alternative'],
'artists': ['The Royal Concept'],
'tags': [],
},
}, {
# Geo-restricted
'url': 'https://soundcloud.com/the-concept-band/goldrushed-mastered?in=the-concept-band/sets/the-royal-concept-ep',
'info_dict': {
'id': '47127627',
'ext': 'opus',
'title': 'Goldrushed',
'track': 'Goldrushed',
'description': 'md5:c0080b79a3710811d60234f94f391a40',
'uploader': 'The Royal Concept',
'uploader_id': '9615865',
'timestamp': 1337635207,
'upload_date': '20120521',
'duration': 227.103,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/the-concept-band',
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'genres': ['Alternative'],
'artists': ['The Royal Concept'],
'tags': [],
},
}, {
# private link
{
'url': 'https://soundcloud.com/jaimemf/youtube-dl-test-video-a-y-baw/s-8Pjrp',
'md5': 'aa0dd32bfea9b0c5ef4f02aacd080604',
'info_dict': {
'id': '123998367',
'ext': 'mp3',
'title': 'Youtube - Dl Test Video \'\' Ä↭',
'track': 'Youtube - Dl Test Video \'\' Ä↭',
'description': 'test chars: "\'/\\ä↭',
'uploader': 'jaimeMF',
'uploader_id': '69767071',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9.927,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/jaimemf',
'thumbnail': 'https://a1.sndcdn.com/images/default_avatar_large.png',
'genres': ['youtubedl'],
'tags': [],
},
'url': 'https://soundcloud.com/jaimemf/youtube-dl-test-video-a-y-baw/s-8Pjrp',
'md5': 'aa0dd32bfea9b0c5ef4f02aacd080604',
'info_dict': {
'id': '123998367',
'ext': 'mp3',
'title': 'Youtube - Dl Test Video \'\' Ä↭',
'track': 'Youtube - Dl Test Video \'\' Ä↭',
'description': 'md5:610b729ee06ac4cedaa28607212948f3',
'uploader': 'jaimeMF',
'uploader_id': '69767071',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9.927,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/jaimemf',
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'genres': ['youtubedl'],
'tags': [],
},
}, {
# private link (alt format)
{
'url': 'https://api.soundcloud.com/tracks/123998367?secret_token=s-8Pjrp',
'md5': 'aa0dd32bfea9b0c5ef4f02aacd080604',
'info_dict': {
'id': '123998367',
'ext': 'mp3',
'title': 'Youtube - Dl Test Video \'\' Ä↭',
'track': 'Youtube - Dl Test Video \'\' Ä↭',
'description': 'test chars: "\'/\\ä↭',
'uploader': 'jaimeMF',
'uploader_id': '69767071',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9.927,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/jaimemf',
'thumbnail': 'https://a1.sndcdn.com/images/default_avatar_large.png',
'genres': ['youtubedl'],
'tags': [],
},
'url': 'https://api.soundcloud.com/tracks/123998367?secret_token=s-8Pjrp',
'md5': 'aa0dd32bfea9b0c5ef4f02aacd080604',
'info_dict': {
'id': '123998367',
'ext': 'mp3',
'title': 'Youtube - Dl Test Video \'\' Ä↭',
'track': 'Youtube - Dl Test Video \'\' Ä↭',
'description': 'md5:610b729ee06ac4cedaa28607212948f3',
'uploader': 'jaimeMF',
'uploader_id': '69767071',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9.927,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/jaimemf',
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'genres': ['youtubedl'],
'tags': [],
},
}, {
# downloadable song
{
'url': 'https://soundcloud.com/the80m/the-following',
'md5': 'ecb87d7705d5f53e6c02a63760573c75', # wav: '9ffcddb08c87d74fb5808a3c183a1d04'
'info_dict': {
'id': '343609555',
'ext': 'opus', # wav original available with auth
'title': 'The Following',
'track': 'The Following',
'description': '',
'uploader': '80M',
'uploader_id': '312384765',
'uploader_url': 'https://soundcloud.com/the80m',
'upload_date': '20170922',
'timestamp': 1506120436,
'duration': 397.228,
'thumbnail': 'https://i1.sndcdn.com/artworks-000243916348-ktoo7d-original.jpg',
'license': 'all-rights-reserved',
'like_count': int,
'comment_count': int,
'repost_count': int,
'view_count': int,
'genres': ['Dance & EDM'],
'artists': ['80M'],
'tags': ['80M', 'EDM', 'Dance', 'Music'],
},
'expected_warnings': ['Original download format is only available for registered users'],
'url': 'https://soundcloud.com/the80m/the-following',
'md5': 'ecb87d7705d5f53e6c02a63760573c75', # wav: '9ffcddb08c87d74fb5808a3c183a1d04'
'info_dict': {
'id': '343609555',
'ext': 'opus', # wav original available with auth
'title': 'The Following',
'track': 'The Following',
'description': '',
'uploader': '80M',
'uploader_id': '312384765',
'uploader_url': 'https://soundcloud.com/the80m',
'upload_date': '20170922',
'timestamp': 1506120436,
'duration': 397.228,
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'license': 'all-rights-reserved',
'like_count': int,
'comment_count': int,
'repost_count': int,
'view_count': int,
'genres': ['Dance & EDM'],
'artists': ['80M'],
'tags': 'count:4',
},
'expected_warnings': ['Original download format is only available for registered users'],
}, {
# private link, downloadable format
# tags with spaces (e.g. "Uplifting Trance", "Ori Uplift")
{
'url': 'https://soundcloud.com/oriuplift/uponly-238-no-talking-wav/s-AyZUd',
'md5': '2e1530d0e9986a833a67cb34fc90ece0', # wav: '64a60b16e617d41d0bef032b7f55441e'
'info_dict': {
'id': '340344461',
'ext': 'opus', # wav original available with auth
'title': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
'track': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
'description': 'md5:fa20ee0fca76a3d6df8c7e57f3715366',
'uploader': 'Ori Uplift Music',
'uploader_id': '12563093',
'timestamp': 1504206263,
'upload_date': '20170831',
'duration': 7449.096,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'thumbnail': 'https://i1.sndcdn.com/artworks-000240712245-kedn4p-original.jpg',
'uploader_url': 'https://soundcloud.com/oriuplift',
'genres': ['Trance'],
'artists': ['Ori Uplift'],
'tags': ['Orchestral', 'Emotional', 'Uplifting Trance', 'Trance', 'Ori Uplift', 'UpOnly'],
},
'expected_warnings': ['Original download format is only available for registered users'],
'url': 'https://soundcloud.com/oriuplift/uponly-238-no-talking-wav/s-AyZUd',
'md5': '2e1530d0e9986a833a67cb34fc90ece0', # wav: '64a60b16e617d41d0bef032b7f55441e'
'info_dict': {
'id': '340344461',
'ext': 'opus', # wav original available with auth
'title': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
'track': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
'description': 'md5:fa20ee0fca76a3d6df8c7e57f3715366',
'uploader': 'Ori Uplift Music',
'uploader_id': '12563093',
'timestamp': 1504206263,
'upload_date': '20170831',
'duration': 7449.096,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'uploader_url': 'https://soundcloud.com/oriuplift',
'genres': ['Trance'],
'artists': ['Ori Uplift'],
'tags': 'count:6',
},
'expected_warnings': ['Original download format is only available for registered users'],
}, {
# no album art, use avatar pic for thumbnail
{
'url': 'https://soundcloud.com/garyvee/sideways-prod-mad-real',
'md5': '59c7872bc44e5d99b7211891664760c2',
'info_dict': {
'id': '309699954',
'ext': 'mp3',
'title': 'Sideways (Prod. Mad Real)',
'track': 'Sideways (Prod. Mad Real)',
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
'uploader': 'garyvee',
'uploader_id': '2366352',
'timestamp': 1488152409,
'upload_date': '20170226',
'duration': 207.012,
'thumbnail': r're:https?://.*\.jpg',
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/garyvee',
'artists': ['MadReal'],
'tags': [],
},
'params': {
'skip_download': True,
},
'url': 'https://soundcloud.com/garyvee/sideways-prod-mad-real',
'md5': '59c7872bc44e5d99b7211891664760c2',
'info_dict': {
'id': '309699954',
'ext': 'mp3',
'title': 'Sideways (Prod. Mad Real)',
'track': 'Sideways (Prod. Mad Real)',
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
'uploader': 'garyvee',
'uploader_id': '2366352',
'timestamp': 1488152409,
'upload_date': '20170226',
'duration': 207.012,
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'uploader_url': 'https://soundcloud.com/garyvee',
'artists': ['MadReal'],
'tags': [],
},
{
'url': 'https://soundcloud.com/giovannisarani/mezzo-valzer',
'md5': '8227c3473a4264df6b02ad7e5b7527ac',
'info_dict': {
'id': '583011102',
'ext': 'opus',
'title': 'Mezzo Valzer',
'track': 'Mezzo Valzer',
'description': 'md5:f4d5f39d52e0ccc2b4f665326428901a',
'uploader': 'Giovanni Sarani',
'uploader_id': '3352531',
'timestamp': 1551394171,
'upload_date': '20190228',
'duration': 180.157,
'thumbnail': r're:https?://.*\.jpg',
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'genres': ['Piano'],
'uploader_url': 'https://soundcloud.com/giovannisarani',
'tags': 'count:10',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://soundcloud.com/giovannisarani/mezzo-valzer',
'md5': '8227c3473a4264df6b02ad7e5b7527ac',
'info_dict': {
'id': '583011102',
'ext': 'm4a',
'title': 'Mezzo Valzer',
'track': 'Mezzo Valzer',
'description': 'md5:f4d5f39d52e0ccc2b4f665326428901a',
'uploader': 'Giovanni Sarani',
'uploader_id': '3352531',
'timestamp': 1551394171,
'upload_date': '20190228',
'duration': 180.134,
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
'genres': ['Piano'],
'uploader_url': 'https://soundcloud.com/giovannisarani',
'tags': 'count:10',
},
'params': {'skip_download': 'm3u8'},
}, {
# .png "original" artwork, 160kbps m4a HLS format
{
'url': 'https://soundcloud.com/skorxh/audio-dealer',
'info_dict': {
'id': '2011421339',
'ext': 'm4a',
'title': 'audio dealer',
'description': '',
'uploader': '$KORCH',
'uploader_id': '150292288',
'uploader_url': 'https://soundcloud.com/skorxh',
'comment_count': int,
'view_count': int,
'like_count': int,
'repost_count': int,
'duration': 213.469,
'tags': [],
'artists': ['$KORXH'],
'track': 'audio dealer',
'timestamp': 1737143201,
'upload_date': '20250117',
'license': 'all-rights-reserved',
'thumbnail': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png',
'thumbnails': [
{'id': 'mini', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-mini.jpg'},
{'id': 'tiny', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-tiny.jpg'},
{'id': 'small', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-small.jpg'},
{'id': 'badge', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-badge.jpg'},
{'id': 't67x67', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t67x67.jpg'},
{'id': 'large', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-large.jpg'},
{'id': 't300x300', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t300x300.jpg'},
{'id': 'crop', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-crop.jpg'},
{'id': 't500x500', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t500x500.jpg'},
{'id': 'original', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png'},
],
},
'params': {'skip_download': 'm3u8', 'format': 'hls_aac_160k'},
'url': 'https://soundcloud.com/skorxh/audio-dealer',
'info_dict': {
'id': '2011421339',
'ext': 'm4a',
'title': 'audio dealer',
'description': '',
'uploader': '$KORCH',
'uploader_id': '150292288',
'uploader_url': 'https://soundcloud.com/skorxh',
'comment_count': int,
'view_count': int,
'like_count': int,
'repost_count': int,
'duration': 213.469,
'tags': [],
'artists': ['$KORXH'],
'track': 'audio dealer',
'timestamp': 1737143201,
'upload_date': '20250117',
'license': 'all-rights-reserved',
'thumbnail': r're:https?://[ai]1\.sndcdn\.com/.+\.(?:jpg|png)',
'thumbnails': [
{'id': 'mini', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-mini.jpg'},
{'id': 'tiny', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-tiny.jpg'},
{'id': 'small', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-small.jpg'},
{'id': 'badge', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-badge.jpg'},
{'id': 't67x67', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t67x67.jpg'},
{'id': 'large', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-large.jpg'},
{'id': 't300x300', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t300x300.jpg'},
{'id': 'crop', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-crop.jpg'},
{'id': 't500x500', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-t500x500.jpg'},
{'id': 'original', 'url': 'https://i1.sndcdn.com/artworks-a1wKGMYNreDLTMrT-fGjRiw-original.png'},
],
},
{
# AAC HQ format available (account with active subscription needed)
'url': 'https://soundcloud.com/wandw/the-chainsmokers-ft-daya-dont-let-me-down-ww-remix-1',
'only_matching': True,
},
{
# Go+ (account with active subscription needed)
'url': 'https://soundcloud.com/taylorswiftofficial/look-what-you-made-me-do',
'only_matching': True,
},
]
'params': {'skip_download': 'm3u8', 'format': 'hls_aac_160k'},
}, {
# AAC HQ format available (account with active subscription needed)
'url': 'https://soundcloud.com/wandw/the-chainsmokers-ft-daya-dont-let-me-down-ww-remix-1',
'only_matching': True,
}, {
# Go+ (account with active subscription needed)
'url': 'https://soundcloud.com/taylorswiftofficial/look-what-you-made-me-do',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = self._match_valid_url(url)
@@ -907,7 +930,7 @@ class SoundcloudUserIE(SoundcloudPagedPlaylistBaseIE):
'id': '7098329',
'title': 'Grynpyret (Spotlight)',
},
'playlist_mincount': 1,
'playlist_mincount': 0,
}, {
'url': 'https://soundcloud.com/one-thousand-and-one/comments',
'info_dict': {
@@ -998,7 +1021,7 @@ class SoundcloudRelatedIE(SoundcloudPagedPlaylistBaseIE):
'id': '1084577272',
'title': 'Sexapil - Pingers 5 (Recommended)',
},
'playlist_mincount': 50,
'playlist_mincount': 49,
}, {
'url': 'https://soundcloud.com/wajang/sexapil-pingers-5/albums',
'info_dict': {
@@ -1045,7 +1068,7 @@ class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
'info_dict': {
'id': '4110309',
'title': 'TILT Brass - Bowery Poetry Club, August \'03 [Non-Site SCR 02]',
'description': 're:.*?TILT Brass - Bowery Poetry Club',
'description': 'md5:e4373f7177fe3db292a8552b4ec41bc6',
'uploader': 'Non-Site Records',
'uploader_id': '33660914',
'album_artists': ['Non-Site Records'],

Some files were not shown because too many files have changed in this diff Show More